sci.geo.meteorology (Meteorology): For the discussion of meteorology and related topics.

  #41
December 12th 09, 08:57 PM, posted to alt.global-warming,alt.politics.libertarian,sci.geo.meteorology,sci.physics
Can Global Warming Predictions be Tested with Observations of the Real Climate System?

On Sat, 12 Dec 2009 13:44:09 -0600, TUKA wrote:

On 2009-12-12, Bill Ward wrote:
On Sat, 12 Dec 2009 09:53:43 -0500, jmfbahciv wrote:

Bill Ward wrote:
On Fri, 11 Dec 2009 09:03:37 -0500, jmfbahciv wrote:

Bill Ward wrote:
On Thu, 10 Dec 2009 08:01:19 -0500, jmfbahciv wrote:

Bill Ward wrote:
On Wed, 09 Dec 2009 09:13:35 -0500, jmfbahciv wrote:

On Tue, 08 Dec 2009 08:41:40 -0500, jmfbahciv wrote:

not the
original paper forms. I'm assuming there will be keyboarding
errors, wrong dates, etc, which should be checked against the
paper originals to avoid propagating unambiguous errors. Range
checks and other automated methods could be used to flag
suspected errors for human intervention. I am specifically
excluding any "corrections" based on opinion or assumptions such
as UHI, etc.
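To make the range-check idea concrete, here is a minimal Python sketch; the file name, the column layout, and the -40..130 F limits are all invented for the example:

# Minimal sketch of a range check on keyed-in readings.
# The CSV layout (station, date, temp_f) and the limits are assumptions.
import csv

def flag_suspects(path, low=-40.0, high=130.0):
    """Return (row, reason) pairs that should be checked by a human
    against the original paper forms."""
    suspects = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            raw = (row.get("temp_f") or "").strip()
            try:
                value = float(raw)
            except ValueError:
                suspects.append((row, "not a number"))
                continue
            if not (low <= value <= high):
                suspects.append((row, "outside expected range"))
    return suspects

if __name__ == "__main__":
    for row, reason in flag_suspects("keyed_station_data.csv"):
        print(reason, row)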


All of this requires code that massages the real data. So you
aren't talking about raw data here either.


It doesn't "require" code, it requires a consistent, transparent,
algorithm, whether done by machine or not.


Which requires code if you're putting it into bits and storing the
results on a system which can be accessed by the rest of the world's
computers.

It's easier that way. But the most important thing is to avoid
corrupting the data.


There are lots of ways it can be corrupted. A lot of them don't even
require a human being.

Raw data is numbers, not nice wordage in English ASCII.
And I'm talking about adding the labels, dates, locations and
other metadata required to make it usable. By your definition,
"raw data" would be useless.

Then the data has to be massaged by code which has to be written,
tested, debugged, and load tested. This takes manpower, money,
time, and maintenance. By your definition, the bits put on a
public server will not be data but a report of the data.


I think that's your definition. I said, "I'm talking about
unadjusted digital versions of the 'raw data'", and you took issue
with it.


No, didn't talk about that. You want prettied up and reformatted so
anybody can read it and understand what it is. That takes code and
massages the raw data.

"Adjusting", or "massaging" is different from "reformatting".

All reformatting is massaging. If you are not doing a bit-for-bit
copy, you are massaging the file.
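A bit-for-bit copy is at least cheap to verify: compare digests of the two files and they either match exactly or they don't. Sketch only; the file names are placeholders:

# Sketch: confirm a copy is bit-for-bit identical to the original
# by comparing SHA-256 digests.  File names are placeholders.
import hashlib

def sha256_of(path, chunk=1 << 20):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()

same = sha256_of("station_log_original.dat") == sha256_of("station_log_copy.dat")
print("bit-for-bit identical" if same else "files differ")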


You use strange definitions, but OK.


By your definition they'd be useless:

And now for the scores: 7 to 3; 2 to 1; and 21 to 7.

There's your "raw data", but it's not all that useful.
That is not raw data. You've typed it in and its format is ASCII.

You seem to be straining at gnats here. When does the mercury
position become "raw data" to you? When does it stop being "raw
data"?

Raw data is the original collection of facts. Prettying numbers up to
be displayed on a TTY screen or hardcopy paper requires massaging if
those bits are stored on computer gear.


I didn't see an answer to my question there. At what point does the
value representing the position of the Hg meniscus become "raw data"?

"[O]riginal collection of facts" is a bit ambiguous. Is it when the
observer reads it, when he initially writes it down on a form, when he
keys it into a computer memory, when he saves it to permanent media,
when a hardcopy is printed...? If you're going to make up
definitions, you at least need to be specific and consistent.

I define "raw data" as any copy of the original reading that carries
exactly the same information as the original reading, no matter what
format it's in. If any information has changed, it's no longer raw
data. If the information is the same, but the data has been
reformatted, labeled, columnized, "prettied up", sorted, or any other
information preserving transformation, it's still raw data, since the
information is unchanged.
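That definition can be checked mechanically: reformat the record, parse it back, and confirm every value survives the round trip. A small sketch, with an invented fixed-width layout:

# Sketch: verify that a reformatting step preserved the information.
# The fixed-width layout (station 6 chars, date 8, temp 6) is invented.
def parse_fixed(line):
    return {"station": line[0:6], "date": line[6:14],
            "temp_f": float(line[14:20])}

def to_csv(rec):
    return "{station},{date},{temp_f}".format(**rec)

def parse_csv(line):
    station, date, temp = line.split(",")
    return {"station": station, "date": date, "temp_f": float(temp)}

original = "72494020091212  53.0"
rec = parse_fixed(original)
reformatted = to_csv(rec)

# Same values back out: no information lost, so by the definition above
# the reformatted copy still counts as raw data.
assert parse_csv(reformatted) == rec
print(reformatted)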

Do you see any problem with that?


Yes, because of one thing. The verification that it is unchanged. Which
is why any science class trains you to enter the data directly into
whatever will be the retention mechanism. In olden days, that was a log
book. Today, that is typically some sort of digital form.

If there is no transcription, even digital, then I have no problem with
it. But if it is flowed into another form, there is some potential for
error. Which is why you don't destroy the raw data.


And why I stipulated information preserving transformations only. Of
course it has to be checked. It's been a long time, but I can remember
independent key entering of data by two persons to flag errors.
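The double-keying check is easy to automate now: compare the two independently keyed files row by row and send every disagreement back to the paper form. Sketch only, file names invented:

# Sketch: double-key verification.  Two typists key the same forms into
# two files; any row where they disagree gets re-checked against paper.
import csv
from itertools import zip_longest

def keying_disagreements(path_a, path_b):
    with open(path_a, newline="") as fa, open(path_b, newline="") as fb:
        rows_a, rows_b = list(csv.reader(fa)), list(csv.reader(fb))
    problems = []
    for i, (ra, rb) in enumerate(zip_longest(rows_a, rows_b), start=1):
        if ra != rb:
            problems.append((i, ra, rb))
    return problems

for line_no, a, b in keying_disagreements("typist_a.csv", "typist_b.csv"):
    print("row %d differs: %r vs %r" % (line_no, a, b))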

I hope you realize I'm not condoning discarding original data, I just
want usable backups to be available of the raw data used in published
papers.

Financial transactions seem to work pretty much OK, and that's only
money, not irreplaceable physical data.


  #42
December 12th 09, 09:23 PM, posted to alt.global-warming,alt.politics.libertarian,sci.geo.meteorology,sci.physics
Can Global Warming Predictions be Tested with Observations of the Real Climate System?

On 2009-12-12, Bill Ward wrote:
snip

And why I stipulated information preserving transformations only. Of
course it has to be checked. It's been a long time, but I can remember
independent key entering of data by two persons to flag errors.

I hope you realize I'm not condoning discarding original data, I just
want usable backups to be available of the raw data used in published
papers.

Financial transactions seem to work pretty much OK, and that's only
money, not irreplaceable physical data.


OK, I misunderstood what you were saying. I am sorry I raised the
point, it is obvious you had thought of it.

Do you have any analysis -- or is such analysis possible -- of the state
of the East Anglia data? Has any information been made available that you
know of, or has anyone done an exposition?

--
The sun, with all those planets revolving around it and dependent on it,
can still ripen a bunch of grapes as if it had nothing else in the
universe to do. -- Galileo
  #43
December 12th 09, 11:23 PM, posted to alt.global-warming,alt.politics.libertarian,sci.geo.meteorology,sci.physics
Can Global Warming Predictions be Tested with Observations of the Real Climate System?

On Sat, 12 Dec 2009 09:53:43 -0500, jmfbahciv jmfbahciv@aol wrote:

Bill Ward wrote:
On Fri, 11 Dec 2009 09:03:37 -0500, jmfbahciv wrote:

Bill Ward wrote:
On Thu, 10 Dec 2009 08:01:19 -0500, jmfbahciv wrote:

Bill Ward wrote:
On Wed, 09 Dec 2009 09:13:35 -0500, jmfbahciv wrote:

Bill Ward wrote:
On Tue, 08 Dec 2009 08:41:40 -0500, jmfbahciv wrote:

Bill Ward wrote:
On Mon, 07 Dec 2009 08:38:20 -0500, jmfbahciv wrote:

Bill Ward wrote:
On Sun, 06 Dec 2009 21:43:15 -0800, isw wrote:

In article ,
7 wrote:

Eric Gisin wrote:

Positive cloud feedback is the key to Climate Alarmism, but
the science behind it is questionable. Note how the
alarmists cannot respond to this important issue, other than
with insane rants and conspiracies.

http://www.drroyspencer.com/2009/12/can-global-warming-predictions-be-tested-with-observations-of-the-real-climate-system/
December 6, 2009, 08:19:36 | Roy W. Spencer, Ph. D.

In a little over a week I will be giving an invited paper at
the Fall meeting of the American Geophysical Union (AGU) in
San Francisco, in a special session devoted to feedbacks in
the climate system. If you don't already know, feedbacks are
what will determine whether anthropogenic global warming is
strong or weak, with cloud feedbacks being the most
uncertain of all.

In the 12 minutes I have for my presentation, I hope to
convince as many scientists as possible the futility of
previous attempts to estimate cloud feedbacks in the climate
system. And unless we can measure cloud feedbacks in nature,
we can not test the feedbacks operating in computerized
climate models.

WHAT ARE FEEDBACKS?
Systems with feedback have characteristic time constants,
oscillations and damping characteristics, all of which are
self-evident and measurable. Except if you are an AGW
holowarming nut and fruitcake. You'll just have to make up
some more numbers and bully more publications to get it past
peer review.

Climate science needs more transparency.

That's easy:

1. Put all your emails on public ftp servers.

2. Put all the raw climate data in public ftp servers so that
it can be peer reviewed.
I don't have any problem at all with *honest* peer review.
What I do have a BIG problem with is making the data available
to people who are certainly NOT "peers" (in the sense of
having little or no scientific training in any field, let
alone a specialization in anything relating to climatology),
who furthermore have a real anti-warming agenda, and who will,
either willfully or ignorantly, misinterpret the data to suit
their purposes, and spread the resulting disinformation far
and wide.

How do you propose to prevent that?
Excellent question.
Yup.

First, I'd write a clear, coherent, complete description and
explanation of the exact mechanism by which CO2 is thought to
increase surface temperatures. I'd aim it at the level of a
person who's had high school physics, but has forgotten much of
it. I'd make the best, most honest case I could, showing and
explaining the evidence both supporting and against the
hypothesis.

Then I'd publish the first draft and invite review by anyone
who feels qualified to comment. The second draft would
honestly answer the issues and misunderstandings raised in
those comments, again keeping the language and concepts
accessible and convincing to any interested high school physics
graduate.

The process would iterate until a sufficiently understandable,
unambiguous case could be made for AGW to convince most people,
or the hypothesis is clearly falsified.

IOW, cut the condescending, supercilious crap and have an
honest, open debate. Focus on learning how the climate system
actually works rather than trying to advance a political agenda
by frightening gullible people with scare tactics.

And the scientist is no longer doing his/her science. To make
data available requires a maintenance staff before it's written
to the public disk.
Don't you think it might be a good idea to do some data QC before
it's written to disks distributed to anyone? I'd think that's
part of the scientist's job. Why should the public see anything
different from the same disks the research is based on? The more
eyes looking, the earlier discrepancies can be resolved. Science
is supposed to be an open process, not a quasi-religious
ceremony.
What discrepancies? We're talking about science data, not a doc
that can be proof-read.
If that's the case, why not just post it? Why try to hide it?
What are you talking about now? I've been trying to discuss the
problems with the suggestion that any science data be put on a
public server with documentation describing it so a non-scientist
would understand the data. Frankly, I think this (documenting it)
is impossible but there are amazing writers in the science biz.
I'm talking about making the data available online to whoever wants
to review it, not keeping it from those who might disagree with the
conclusions the IPCC is promoting. There are no "wrong people" who
shouldn't have access to the data, and there's no need to be sure
they "understand" it in the "correct" way. That's not up to you, me,
or anyone else to decide. It's public property.

It seems a shame for Steve McIntyre to have to do the QC by
reverse engineering secret analytical processes after the fact.


Are you talking about raw data? I don't see how you can QC raw
data.
Organize it into files suitable for archiving and searching, then
check for typos and transcription errors.
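"Organize it into files suitable for archiving and searching" can be as dull as splitting one big keying file into a file per station plus a small index. A sketch, with all names and the column layout invented:

# Sketch: split a combined keyed-in file into per-station files and
# write a tiny index so the archive can be searched later.
# File names and the column layout are invented.
import csv
from collections import defaultdict

by_station = defaultdict(list)
with open("keyed_station_data.csv", newline="") as f:
    reader = csv.DictReader(f)
    fields = reader.fieldnames
    for row in reader:
        by_station[row["station"]].append(row)

with open("INDEX.txt", "w") as index:
    for station, rows in sorted(by_station.items()):
        name = "station_%s.csv" % station
        with open(name, "w", newline="") as out:
            w = csv.DictWriter(out, fieldnames=fields)
            w.writeheader()
            w.writerows(rows)
        index.write("%s  %d records\n" % (name, len(rows)))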
WTF are you talking about? There can't be typos in raw data, let
alone transcription errors.
I'm talking about unadjusted digital versions of the "raw data",
No, you are not. See below.
You know what I'm talking about, and I don't? That's quite a gift.
Yes. I know what you're not talking about.


Was that a typo, or you actually agreeing with me now?

It's clear you have no idea what processes are involved w.r.t. putting
readable bits on a computer system.


You might be surprised.


Not really.


not the
original paper forms. I'm assuming there will be keyboarding errors,
wrong dates, etc, which should be checked against the paper
originals to avoid propagating unambiguous errors. Range checks and
other automated methods could be used to flag suspected errors for
human intervention. I am specifically excluding any "corrections"
based on opinion or assumptions such as UHI, etc.
All of this requires code that massages the real data. So you aren't
talking about raw data here either.
It doesn't "require" code, it requires a consistent, transparent,
algorithm, whether done by machine or not.
Which requires code if you're putting it into bits and storing the
results on a system which can be accessed by the rest of the world's
computers.


It's easier that way. But the most important thing is to avoid
corrupting the data.


There are lots of ways it can be corrupted. A lot of them don't even
require a human being.


Raw data is numbers, not nice wordage in English ASCII.
And I'm talking about adding the labels, dates, locations and other
metadata required to make it usable. By your definition, "raw data"
would be useless.


Then the data has to be massaged by code which has to be written,
tested, debugged, and load tested. This takes manpower, money, time,
and maintenance. By your definition, the bits put on a public server
will not be data but a report of the data.
I think that's your definition. I said, "I'm talking about unadjusted
digital versions of the 'raw data'", and you took issue with it.
No, didn't talk about that. You want prettied up and reformatted so
anybody can read it and understand what it is. That takes code and
massages the raw data.


"Adjusting", or "massaging" is different from "reformatting".


All reformatting is massaging. If you are not doing a bit-for-bit
copy, you are massaging the file.


By your definition they'd be useless:

And now for the scores: 7 to 3; 2 to 1; and 21 to 7.

There's your "raw data", but it's not all that useful.
That is not raw data. You've typed it in and its format is ASCII.


You seem to be straining at gnats here. When does the mercury position
become "raw data" to you? When does it stop being "raw data"?


Raw data is the original collection of facts. Prettying numbers
up to be displayed on a TTY screen or hardcopy paper requires
massaging if those bits are stored on computer gear.


If you want to call the verification and formatting "massaging", fine,
but if it's not done, the data is unusable.
Exactly. It's unusable to most people except those who run code to use
it as input (which is what scientists do).


And many others who might be seriously interested.


This thread has been talking about non-scientists having access to
any data which was collected; further constraints were declared
that the data had to be prettied up and completely described so
that anybody could access the data and know what it meant. One
of you made a further requirement that the scientist do all that
work. Ptui.


That should be one of the
deliverables in the data collection contract.

You don't know what you're talking about.


And you're assuming facts not in evidence.


Actually, I'm not assuming anything. I'm talking about moving bits
and presenting them to non-expert readers. I know a lot about this
kind of thing because I did that kind of work for 25 years.
It's you that's worried about "non-expert" readers, not me. I just
want it accessible in a usable form. You don't need to sugar coat it.
Your kind of usable form requires the raw data to be massaged before
storing it on a public forum.


I guess that depends on your definition of "massaging". As long as it
doesn't corrupt the data, I don't care what you call it, but the simpler,
the better.


You can't tell if the data's been corrupted if it's been reformatted.
You have to have a QA specialist checking.


I'm definitely not talking about a contract.

Then who's paying for it? If it's not taxpayers, then I really don't
care how it's done. If it is from taxes, then there better be an
enforceable contract in place, or we'll be right back where we are
now.

Contract law is different in each and every country.
So? There are still enforceable contracts. How would you do
international business without them?
You sign a contract for each country or entity in which you want to do
business.


Exactly. Why were you trying to make an issue of such an obvious point?


You're the one who started to talk about contracts.


Which taxpayers do you think paid for the gathering of that data? Who
pays for the data the maritime business provides?
Don't know, don't care. Are you saying the IPCC is not tax-funded?

Where did our $50B go, then? I think grants are generally in the form
of contracts.


You don't even know how things get done.


Again, you might be surprised.



Not at all. You have no idea how much work is involved.

/BAH


You put way too much into what scientists
do and how little a non-professional can do and
understands.

Nothing has to be done to the data, just
make it available according to the law, and let
the recipients worry about the format.

It isn't just the data that is subject to the
FOIA; it is the whole ball of wax that public
money paid for. Professional work is supposed
to be notated, even within the text of papers.
Hiding anything is either hiding something,
or some kind of perversion about importance.






  #44
December 12th 09, 11:23 PM, posted to alt.global-warming,alt.politics.libertarian,sci.geo.meteorology,sci.physics
Can Global Warming Predictions be Tested with Observations of the Real Climate System?

On Sat, 12 Dec 2009 15:23:50 -0600, TUKA wrote:

snip

OK, I misunderstood what you were saying. I am sorry I raised the point,
it is obvious you had thought of it.


No problem.

Do you have any analysis -- or is such analysis possible -- of the state
of the East Anglia data? Has any information been made available that
you know of, or has anyone done an exposition?


Just Steve and the Climateaudit gang. He seems to be taking it slow and
careful, as he should.


  #45
December 13th 09, 01:06 AM, posted to alt.global-warming,alt.politics.libertarian,sci.geo.meteorology,sci.physics
Can Global Warming Predictions be Tested with Observations of the Real Climate System?

On 2009-12-12, Bill Ward wrote:
snip

OK, I misunderstood what you were saying. I am sorry I raised the point,
it is obvious you had thought of it.


No problem.

Do you have any analysis -- or is such analysis possible -- of the state
of the East Anglia data? Has any information been made available that
you know of, or has anyone done an exposition?


Just Steve and the Climateaudit gang. He seems to be taking it slow and
careful, as he should.


Oh yes, of course. I wasn't thinking of data analysis, more like metadata
analysis. Like a description of what data is actually there, what has been
destroyed, what the prognosis is, etc.

--
I am a great believer in luck, and I find that the harder I work
the more luck I have. -- Thomas Jefferson


  #46
December 14th 09, 02:16 PM, posted to alt.global-warming,alt.politics.libertarian,sci.geo.meteorology,sci.physics
Can Global Warming Predictions be Tested with Observations of the Real Climate System?

I M @ good guy wrote:
On Sat, 12 Dec 2009 09:53:43 -0500, jmfbahciv jmfbahciv@aol wrote:

snip clean off the tty

Which taxpayers do you think paid for the gathering of that data? Who
pays for the data the maritime business provides?
Don't know, don't care. Are you saying the IPCC is not tax-funded?

Where did our $50B go, then? I think grants are generally in the form
of contracts.


You don't even know how things get done.
Again, you might be surprised.


Not at all. You have no idea how much work is involved.

/BAH


You put way too much into what scientists
do and how little a non-professional can do and
understands.

Nothing has to be done to the data, just
make it available according to the law, and let
the recipients worry about the format.


That isn't what somebody insisted be done. I've been talking
about the suggestion that the data be prettied up and
documented with an explanation of the conclusions so that
a two-year-old can understand it. That last one is impossible
when the lab project is still in hypothesis-mode.


It isn't just the data that is subject to the
FOIA; it is the whole ball of wax that public
money paid for. Professional work is supposed
to be notated, even within the text of papers.
Hiding anything is either hiding something,
or some kind of perversion about importance.


The hiding is not the problem. The problem is politicians
using this as a basis for passing laws, tweaking economies,
stopping trade, and destroying nations and infrastructures.
Another problem is a public who would rather believe in
conspiracies, unicorns, and outrageous fictions than
expend a tad of mental energy thinking.

If you want to solve this "cheating" problem, then solve
those two.

/BAH
  #47
December 14th 09, 02:57 PM, posted to alt.global-warming,alt.politics.libertarian,sci.geo.meteorology,sci.physics
Can Global Warming Predictions be Tested with Observations of the Real Climate System?

Bill Ward wrote:
On Sat, 12 Dec 2009 09:53:43 -0500, jmfbahciv wrote:

Bill Ward wrote:
On Fri, 11 Dec 2009 09:03:37 -0500, jmfbahciv wrote:

Bill Ward wrote:
On Thu, 10 Dec 2009 08:01:19 -0500, jmfbahciv wrote:


snip clean my tty screen

I think that's your definition. I said, "I'm talking about unadjusted
digital versions of the 'raw data'", and you took issue with it.


No, didn't talk about that. You want prettied up and reformatted so
anybody can read it and understand what it is. That takes code and
massages the raw data.
"Adjusting", or "massaging" is different from "reformatting".

All reformatting is massaging. If you are not doing a bit-for-bit
copy, you are massaging the file.


You use strange definitions, but OK.


It was a term used in my biz, which was hard/software development.


By your definition they'd be useless:

And now for the scores: 7 to 3; 2 to 1; and 21 to 7.

There's your "raw data", but it's not all that useful.
That is not raw data. You've typed it in and its format is ASCII.
You seem to be straining at gnats here. When does the mercury position
become "raw data" to you? When does it stop being "raw data"?

Raw data is the original collection of facts. Prettying numbers up to
be displayed on a TTY screen or hardcopy paper requires massaging if
those bits are stored on computer gear.


I didn't see an answer to my question there. At what point does the
value representing the position of the Hg meniscus become "raw data"?


When it is recorded the first time.


"[O]riginal collection of facts" is a bit ambiguous. Is it when the
observer reads it, when he initially writes it down on a form, when he
keys it into a computer memory, when he saves it to permanent media, when
a hardcopy is printed...? If you're going to make up
at least need to be specific and consistent.


Raw data depends on when, where and how the fact is collected. It is
as varied as the subjects. Data can be recorded with pen and paper
in a bound notebook. It can be collected with an analog device.
It can be collected with a digital device. It can be things in boxes,
scribbles on paper, holes in cards, bits on magnetic tape, bits on
disks, DECtapes, cassettes, CDs, or pictures. (I'm missing some..
oh, ticks on stone or in the sand).



I define "raw data" as any copy of the original reading that carries
exactly the same information as the original reading, no matter what
format it's in.


I would not. A binary datum, 111100111, is not the same as the
number, 747, displayed in ASCII format on your TTY screen.
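For what it's worth, the two forms are easy to lay side by side; the value is the same, only the stored bytes differ. (Reading 747 as octal, DEC-style, is only an assumption here; 111100111 binary is 747 octal.)

# Sketch: the same value held as bits versus displayed as ASCII digits.
# Treating "747" as octal (DEC-style) is an assumption: 0b111100111 == 0o747.
value = 0b111100111            # the binary datum from the post
assert value == 0o747          # the same number written in octal

ascii_form = oct(value)[2:]    # "747", as it would appear on a TTY
print(value, ascii_form)       # 487 747

# The information (the number) is unchanged; the stored bytes are not.
print(value.to_bytes(2, "big"))      # b'\x01\xe7' -- the bits themselves
print(ascii_form.encode("ascii"))    # b'747'      -- the printable form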

If any information has changed, it's no longer raw
data. If the information is the same, but the data has been reformatted,
labeled, columnized, "prettied up", sorted, or any other information
preserving transformation, it's still raw data,


We never called that raw data.

since the information is
unchanged.


The data has been processed through some code which changed the format
it is stored in. It is no longer raw; raw implies no changes have been
made. Any reformatting requires changes. If any of the reformatting
code over time has any bug (say, one that sets a bit which isn't
detected), the outcome of analyses decades later would be affected.
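One standard guard against that kind of silent corruption is a checksum manifest written when the files are first archived, so a flipped bit at least shows up decades later. A sketch, with placeholder names and a flat directory of data files assumed:

# Sketch: write a SHA-256 manifest at archive time; verify it later to
# catch a silently flipped bit.  Assumes one flat directory of data files.
import hashlib, os

def digest(path):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return h.hexdigest()

def write_manifest(directory, manifest="MANIFEST.sha256"):
    with open(os.path.join(directory, manifest), "w") as out:
        for name in sorted(os.listdir(directory)):
            if name != manifest:
                out.write("%s  %s\n" % (digest(os.path.join(directory, name)), name))

def verify_manifest(directory, manifest="MANIFEST.sha256"):
    bad = []
    with open(os.path.join(directory, manifest)) as f:
        for line in f:
            d, name = line.rstrip("\n").split("  ", 1)
            if digest(os.path.join(directory, name)) != d:
                bad.append(name)
    return bad   # empty list means nothing has changed since archiving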


Do you see any problem with that?


Oh, yes. :-) Numbers are an especial problem. Think of data
storages that varied from 8 bits/word to 72 bits/word over three
decades. And now things are measured in "bytes" which vary
with the phase of the sun and the setting of the moon.



If you want to call the verification and formatting "massaging",
fine, but if it's not done, the data is unusable.


Exactly. It's unusable to most people except those who run code to
use it as input (which is what scientists do).
And many others who might be seriously interested.

This thread has been talking about non-scientists having access to any
data which was collected; further constraints were declared that the
data had to be prettied up and completely described so that anybody
could access the data and know what it meant.


That would be what you were talking about, not me. All I insisted was
that the data be usable, which I think you are calling "prettied up".


I think it was Eric who wanted stuff made public in an attempt to
prevent what happened with this global warming fiasco the politicians
have been milking for oodles of money.


That should be a requirement for any data used to support a paper. If
the data is not in usable form, how could the research be done? It looks
like that may be one of the current problems with "climate science". The
data they were using was/is not in usable form, but they didn't let that
stop them.


I don't know what happened in the East Anglia place. A lot of backtracing
of the data and political discussions and their timelines has to be
done. It is probably impossible because the UN is involved.

One of you made a further
requirement that the scientist do all that work. Ptui.


No, that's for grad students. ;-) But somebody has to do it, or the
research is based on invalid assumptions.


*grin* But we're talking about data that is not garnered by your grad
student over the last year. Verification of that kind of data is
manageable...most of the time ;-).



That should be one of the
deliverables in the data collection contract.

You don't know what you're talking about.
And you're assuming facts not in evidence.
Actually, I'm not assuming anything. I'm talking about moving bits
and presenting them to non-expert readers. I know a lot about this
kind of thing because I did that kind of work for 25 years.


It's you that's worried about "non-expert" readers, not me. I just
want it accessible in a usable form. You don't need to sugar coat
it.


Your kind of usable form requires the raw data to be massaged before
storing it on a public forum.
I guess that depends on your definition of "massaging". As long as it
doesn't corrupt the data, I don't care what you call it, but the
simpler, the better.

You can't tell if the data's been corrupted if it's been reformatted.
You have to have a QA specialist checking.



[this is a thread drift alert]

Shouldn't that be a routine procedure?


By whom? If the data you're using was collected by Leonardo, QA is
a tad problematic.

Or do you expect to use invalid
data to get valid results?


Think of the log tables which were produced and printed. If there is
one typo, and somebody used that number to record a data set.
Now get in your time machine and come back to today. The data set
may be used as input for a lot of analyses today.
Now answer your question. My answer would be yes; at some point
you have to use what is available.

These are aspects of bit recordings I've been trying to solve
for decades. All of my work was involved with shipping code
to customers. All of this discussion reminds me of the work
I did. There are CATCH-22s, deadly embraces, and impossibilities
which are caused by working with data which is invisible to the human
eye.


I'm definitely not talking about a contract.

Then who's paying for it? If it's not taxpayers, then I really
don't care how it's done. If it is from taxes, then there better
be an enforceable contract in place, or we'll be right back where
we are now.

Contract law is different in each and every country.


So? There are still enforceable contracts. How would you do
international business without them?


You sign a contract for each country or entity in which you want to do
business.
Exactly. Why were you trying to make an issue of such an obvious
point?

You're the one who started to talk about contracts.

Which taxpayers do you think paid for the gathering of that data?
Who pays for the data the maritime business provides?


Don't know, don't care. Are you saying the IPCC is not tax-funded?

Where did our $50B go, then? I think grants are generally in the
form of contracts.


You don't even know how things get done.


Again, you might be surprised.


Not at all. You have no idea how much work is involved.


We paid for a lot of work that now appears useless.


It is useless because everybody seems to have depended on one, and
only one, entity for their sources. That is a bloody procedural
problem in the science biz. There aren't independent sources or
studies being used by the politicians, by the UN, or in the science
conclusions. With the advent of the thingie called the WWW,
the myths become the facts at light speed.

I'd rather pay for
careful work done in an open, transparent manner. It's cheaper than
having to redo it.



But that open, transparent manner is expensive, difficult, and
impossible (unless you develop a time machine) in some cases.
Storing data is not trivial, whether it's public or private.

Take a look at all the problems the open source biz has with
computer code. That "open, transparent manner" is
not a trivial endeavour.

/BAH
  #48
December 14th 09, 10:02 PM, posted to alt.global-warming,alt.politics.libertarian,sci.geo.meteorology,sci.physics
Can Global Warming Predictions be Tested with Observations of the Real Climate System?

On Mon, 14 Dec 2009 09:57:19 -0500, jmfbahciv wrote:

Bill Ward wrote:
On Sat, 12 Dec 2009 09:53:43 -0500, jmfbahciv wrote:

Bill Ward wrote:
On Fri, 11 Dec 2009 09:03:37 -0500, jmfbahciv wrote:

Bill Ward wrote:
On Thu, 10 Dec 2009 08:01:19 -0500, jmfbahciv wrote:


snip clean my tty screen

I think that's your definition. I said, "I'm talking about
unadjusted digital versions of the 'raw data'", and you took issue
with it.


No, didn't talk about that. You want prettied up and reformatted so
anybody can read it and understand what it is. That takes code and
massages the raw data.
"Adjusting", or "massaging" is different from "reformatting".
All reformatting is massaging. If you are not doing a bit-for-bit
copy, you are massaging the file.


You use strange definitions, but OK.


It was a term used in my biz, which was hard/software development.


By your definition they'd be useless:

And now for the scores: 7 to 3; 2 to 1; and 21 to 7.

There's your "raw data", but it's not all that useful.
That is not raw data. You've typed it in and its format is ASCII.
You seem to be straining at gnats here. When does the mercury
position become "raw data" to you? When does it stop being "raw
data"?
Raw data is the original collection of facts. Prettying numbers up to
be displayed on a TTY screen or hardcopy paper requires massaging if
those bits are stored on computer gear.


I didn't see an answer to my question there. At what point does the
value representing the position of the Hg meniscus become "raw data"?


When it is recorded the first time.


"[O]riginal collection of facts" is a bit ambiguous. Is it when the
observer reads it, when he initially writes it down on a form, when he
keys it into a computer memory, when he saves it to permanent media,
when a hardcopy is printed...? If you're going to make up
definitions, you at least need to be specific and consistent.


Raw data depends on when, where and how the fact is collected. It is as
varied as the subjects. Data can be recorded with pen and paper in a
bound notebook. It can be collected with an analog device. It can be
collected with a digital device. It can be things in boxes, scribbles on
paper, holes in cards, bits on magnetic tape, bits on disks, DECtapes,
cassettes, CDs, or pictures. (I'm missing some.. oh, ticks on stone or
in the sand).



I define "raw data" as any copy of the original reading that carries
exactly the same information as the original reading, no matter what
format it's in.


I would not. A binary datum, 111100111, is not the same as the number,
747, displayed in ASCII format on your TTY screen.


But carries the same, unadjusted information.

If any information has changed, it's no longer raw data. If the
information is the same, but the data has been reformatted, labeled,
columnized, "prettied up", sorted, or any other information preserving
transformation, it's still raw data,


We never called that raw data.


OK, what do you want to call it? I'm easy.

since the information is
unchanged.


The data has been processed through some code which changed the format
it is stored in. It is no longer raw; raw implies no changes have been
made. Any reformatting requires changes. If any of the reformatting
code over time has any bug (say, one that sets a bit which isn't
detected), the outcome of analyses decades later would be affected.


I agree if the information is changed the data is no longer raw data. I
would call it corrupted data. What do you want to call media that carry
exactly the same information as the raw data but in a different format?

I would call it copies of the raw data, but you seem to prefer some other
unspecified term.


Do you see any problem with that?


Oh, yes. :-) Numbers are an especial problem. Think of data storages
that varied from 8 bits/word to 72 bits/word over three decades. And now
things are measured in "bytes" which vary with the phase of the sun and
the setting of the moon.


You seem to be focusing on the problems in ensuring the data is
transcribed properly into digital form. I'm not disagreeing with that,
I'm just saying no matter who uses the data, it must be transcribed into
a usable format. If researchers are cutting corners on data integrity,
posting it online would be one way to stop that. If they are doing it
right, then there should be no problems in making it available online.

If you want to call the verification and formatting "massaging",
fine, but if it's not done, the data is unusable.


Exactly. It's unusable to most people except those who run code to
use it as input (which is what scientists do).


And many others who might be seriously interested.


This thread has been talking about non-scientists having access to any
data which was collected; further constraints were declared that the
data had to be prettied up and completely described so that anybody
could access the data and know what it meant.


That would be what you were talking about, not me. All I insisted was
that the data be usable, which I think you are calling "prettied up".


I think it was Eric who wanted stuff made public in an attempt to
prevent what happened with this global warming fiasco the politicians
have been milking for oodles of money.


Who wouldn't?

That should be a requirement for any data used to support a paper. If
the data is not in usable form, how could the research be done? It
looks like that may be one of the current problems with "climate
science". The data they were using was/is not in usable form, but they
didn't let that stop them.


I don't know what happened in the East Anglia place. A lot of backtracing
of the data and political discussions and their timelines has to be
done. It is probably impossible because the UN is involved.

One of you made a further
requirement that the scientist do all that work. Ptui.


No, that's for grad students. ;-) But somebody has to do it, or the
research is based on invalid assumptions.


*grin* But we're talking about data that is not garnered by your grad
student over the last year. Verification of that kind of data is
manageable...most of the time ;-).



That should be one of the
deliverables in the data collection contract.

You don't know what you're talking about.
And you're assuming facts not in evidence.
Actually, I'm not assuming anything. I'm talking about moving
bits and presenting them to non-expert readers. I know a lot about
this kind of thing because I did that kind of work for 25 years.


It's you that's worried about "non-expert" readers, not me. I just
want it accessible in a usable form. You don't need to sugar coat
it.


Your kind of usable form requires the raw data to be massaged before
storing it on a public forum.
I guess that depends on your definition of "massaging". As long as
it doesn't corrupt the data, I don't care what you call it, but the
simpler, the better.


You can't tell if the data's been corrupted if it's been reformatted.
You have to have a QA specialist checking.



[this is a thread drift alert]

Shouldn't that be a routine procedure?


By whom? If the data you're using was collected by Leonardo, QA is a
tad problematic.

Or do you expect to use invalid
data to get valid results?


Think of the log tables which were produced and printed. If there is
one typo, and somebody used that number to record a data set.


Why would you use a log table to record data? Logs would be used for
some sort of transformation, not raw data, and, unless you have an old
Pentium, not really an issue today.

Now get
in your time machine and come back to today. The data set may be used as
input for a lot of analyses today. Now answer your question. My answer
would be yes; at some point you have to use what is available.


Then it would appear as an instrumental error, either as an outlier, or
buried in the noise.
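Whether a bad value shows up as an outlier or vanishes into the noise is easy to test; a robust check such as the median-absolute-deviation rule below flags the obvious case (sketch only, numbers made up):

# Sketch: a median-absolute-deviation (MAD) outlier check on a short
# series containing one corrupted value.  All numbers are made up.
from statistics import median

def mad_outliers(values, threshold=5.0):
    med = median(values)
    mad = median(abs(v - med) for v in values) or 1e-9
    return [v for v in values if abs(v - med) / mad > threshold]

readings = [53.1, 52.8, 53.4, 52.9, 83.2, 53.0, 52.7]   # 83.2 is the "typo"
print(mad_outliers(readings))    # [83.2] -- it stands well out of the noise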

These are aspects of bit recordings I've been trying to solve for
decades. All of my work was involved with shipping code to customers.
All of this discussion reminds me of the work I did. There are
CATCH-22s, deadly embraces, and impossibilities which are caused by
working with data which is invisible to the human eye.


It sounds like you may be too close to be objective.

I'm definitely not talking about a contract.

Then who's paying for it? If it's not taxpayers, then I really
don't care how it's done. If it is from taxes, then there better
be an enforceable contract in place, or we'll be right back where
we are now.

Contract law is different in each and every country.


So? There are still enforceable contracts. How would you do
international business without them?


You sign a contract for each country or entity in which you want to
do business.
Exactly. Why were you trying to make an issue of such an obvious
point?
You're the one who started to talk about contracts.

Which taxpayers do you think paid for the gathering of that data?
Who pays for the data the maritime business provides?


Don't know, don't care. Are you saying the IPCC is not tax-funded?

Where did our $50B go, then? I think grants are generally in the
form of contracts.


You don't even know how things get done.


Again, you might be surprised.


Not at all. You have no idea how much work is involved.


We paid for a lot of work that now appears useless.


It is useless because everybody seems to have depended on one, and only
one, entity for their sources. That is a bloody procedural problem in
the science biz. There aren't independent sources or studies being
used by the politicians, by the UN, or in the science conclusions. With the
advent of the thingie called the WWW, the myths become the facts at
light speed.


Exactly. Researchers might be a little more careful if they know someone
else is watching. In fact, I'd say the way they treated Steve M is
proof positive they would.

I'd rather pay for
careful work done in an open, transparent manner. It's cheaper than
having to redo it.



But that open, transparent manner is expensive, difficult, and
impossible (unless you develop a time machine) in some cases. Storing
data is not trivial, whether it's public or private.

Take a look at all the problems the open source biz has with computer
code. That "open, transparent manner" is not a trivial
endeavour.


I didn't say it would be easy, just necessary, if we're going to get any
valid results from the clown brigade.


  #49
December 15th 09, 04:34 AM, posted to alt.global-warming,alt.politics.libertarian,sci.geo.meteorology,sci.physics
Can Global Warming Predictions be Tested with Observations of the Real Climate System?

On Mon, 14 Dec 2009 09:16:50 -0500, jmfbahciv jmfbahciv@aol wrote:

I M @ good guy wrote:
On Sat, 12 Dec 2009 09:53:43 -0500, jmfbahciv jmfbahciv@aol wrote:

snip clean off the tty

Which taxpayers do you think paid for the gathering of that data? Who
pays for the data the maritime business provides?
Don't know, don't care. Are you saying the IPCC is not tax-funded?

Where did our $50B go, then? I think grants are generally in the form
of contracts.


You don't even know how things get done.
Again, you might be surprised.

Not at all. You have no idea how much work is involved.

/BAH


You put way too much into what scientists
do and how little a non-professional can do and
understands.

Nothing has to be done to the data, just
make it available according to the law, and let
the recipients worry about the format.


That isn't what somebody insisted be done. I've been talking
about the suggestion that the data be prettied up and
documented with an explanation of the conclusions so that
a two-year-old can understand it. That last one is impossible
when the lab project is still in hypothesis-mode.


It isn't just the data that is subject to FOIA; it is the whole ball
of wax that public money paid for. Professional work is supposed to be
notated, even within the text of papers. Hiding anything is either
concealing something or some kind of perverse sense of importance.


The hiding is not the problem. The problem is politicians
using this as a basis for passing laws, tweaking economies,
stopping trade, and destroying nations and infrastructures.
Another problem is a public who would rather believe in
conspiracies, unicorns, and outrageous fictions than
expend a tad of mental energy thinking.

If you want to solve this "cheating" problem, then solve
those two.

/BAH


I don't know what you mean by cheating, because
I can't believe that a professional would benefit from it.

Just following the law and complying with
FOIA requests should be enough.





  #50   Report Post  
Old December 15th 09, 01:22 PM posted to alt.global-warming,alt.politics.libertarian,sci.geo.meteorology,sci.physics
external usenet poster
 
First recorded activity by Weather-Banter: Feb 2009
Posts: 59
Default Can Global Warming Predictions be Tested with Observations of the Real Climate System?

Bill Ward wrote:
On Mon, 14 Dec 2009 09:57:19 -0500, jmfbahciv wrote:

Bill Ward wrote:
On Sat, 12 Dec 2009 09:53:43 -0500, jmfbahciv wrote:

Bill Ward wrote:
On Fri, 11 Dec 2009 09:03:37 -0500, jmfbahciv wrote:

Bill Ward wrote:
On Thu, 10 Dec 2009 08:01:19 -0500, jmfbahciv wrote:


snip clean my tty screen

I think that's your definition. I said,"I'm talking about
unadjusted digital versions of the 'raw data'", and you took issue
with it.
No, didn't talk about that. You want prettied up and reformatted so
anybody can read it and understand what it is. That takes code and
massages the raw data.
"Adjusting", or "massaging" is different from "reformatting".
All reformatting is massaging. If you are not doing a bit-for- bit
copy, you are massaging the file.
You use strange definitions, but OK.

It was a term used in my biz, which was hard/software development.


By your definition they'd be useless:

And now for the scores: 7 to 3; 2 to 1; and 21 to 7.

There's your "raw data", but it's not all that useful.
That is not raw data. You've typed it in and its format is ASCII.
You seem to be straining at gnats here. When does the mercury
position become "raw data" to you? When does it stop being "raw
data"?
Raw data is the original collection of facts. Prettying numbers up to
be displayed on a TTY screen or hardcopy paper requires massaging if
those bits are stored on computer gear.
I didn't see an answer to my question there. At what point does the
value representing the position of the Hg meniscus become "raw data"?

When it is recorded the first time.


"[O]riginal collection of facts" is a bit ambiguous. Is it when the
observer reads it, when he initially writes it down on a form, when he
keys it into a computer memory, when he saves it to permanent media,
when a hardcopy is printed...? If you're going to make up
definitions, you at least need to be specific and consistent.

Raw data depends on when, where and how the fact is collected. It is as
varied as the subjects. Data can be recorded with pen and paper in a
bound notebook. It can be collected with an analog device. It can be
collected with a digital device. It can be things in boxes, scribbles on
paper, holes in cards, bits on magnetic tape, bits on disks, DECtapes,
cassettes, CDs, or pictures. (I'm missing some... oh, ticks on stone or
in the sand.)



I define "raw data" as any copy of the original reading that carries
exactly the same information as the original reading, no matter what
format it's in.

I would not. A binary datum, 111100111, is not the same as the number,
747, displayed in ASCII format on your TTY screen.
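
(Aside: read in the octal convention common on DEC gear, that bit pattern
and that number are the same value; the bytes stored and the ASCII
characters on the screen are still different. A tiny Python sketch,
purely illustrative:)

    # 111100111 read as binary and 747 read as octal are the same value.
    assert int("111100111", 2) == int("747", 8) == 487

    # But the ASCII characters "747" shown on a TTY are a different byte
    # sequence from the stored bit pattern.
    print("747".encode("ascii"))   # b'747' -> bytes 0x37 0x34 0x37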


But carries the same, unadjusted information.


But the transformed bits are not the raw data. If there's been any
kind of error during the transforming or the second set of bits
gets hit with a cosmic ray, you can always go back to the raw data.
That's why raw data is kept raw. It's a sanity check.
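
One low-cost way to get that sanity check, as a minimal sketch (Python;
the file names are hypothetical): archive the original bytes untouched
and record a digest, then any later working copy can be compared
against it.

    import hashlib

    def sha256_of(path):
        # Read the file as raw bytes and return its SHA-256 hex digest.
        with open(path, "rb") as f:
            return hashlib.sha256(f.read()).hexdigest()

    # Recorded once, when the original recording is archived.
    archived = sha256_of("station_0042_original.dat")    # hypothetical file

    # Any time later: if a working copy's digest differs, go back to the raw data.
    if sha256_of("station_0042_working.dat") != archived:
        print("working copy no longer matches the archived original")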


If any information has changed, it's no longer raw data. If the
information is the same, but the data has been reformatted, labeled,
columnized, "prettied up", sorted, or any other information preserving
transformation, it's still raw data,

We never called that raw data.


OK, what do you want to call it? I'm easy.


Converted. Copied. The untouched set of data is kept to provide a
sanity check in case the set you are working with
has an error.

since the information is
unchanged.


The data has been processed through some code which changed the format
it is stored in. It is no longer raw; raw implies no changes have been
made. Any reformatting requires changes. If any of the reformatting
code over time has any bug (say one that sets a bit which isn't
detected), the outcome of analyses decades later would be affected.


I agree if the information is changed the data is no longer raw data. I
would call it corrupted data. What do you want to call media that carry
exactly the same information as the raw data but in a different format?


Reformatted. That implies the raw data has been massaged into a
different format. This massaging happens all the time depending
on the usage and computer gear being used.
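
As one sketch of "reformatted but nothing lost" (Python; the record
layout is made up for illustration): decode binary records to ASCII,
then parse the ASCII back and confirm it still carries exactly the
same numbers, even though the bytes differ.

    import struct

    # Hypothetical raw record: three 16-bit unsigned readings, little-endian.
    raw = struct.pack("<3H", 291, 304, 747)

    # Reformat: decode to integers and write them out as ASCII text.
    values = struct.unpack("<3H", raw)
    text = " ".join(str(v) for v in values)        # "291 304 747"

    # Round-trip check: the ASCII copy carries the same information...
    assert tuple(int(s) for s in text.split()) == values
    # ...but it is not a bit-for-bit copy of the original record.
    assert text.encode("ascii") != raw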


I would call it copies of the raw data, but you seem to prefer some other
unspecified term.


That's a much better phrase than insisting it's the raw data.


Do you see any problem with that?

Oh, yes. :-) Numbers are an especial problem. Think of data storage
that varied from 8 bits/word to 72 bits/word over three decades. And now
things are measured in "bytes" which vary with the phase of the sun and
the setting of the moon.


You seem to be focusing on the problems in ensuring the data is
transcribed properly into digital form.


Yup. The suggestion was to make the raw data available to the public.
There are problems with that, and it also takes a lot of manpower.

I'm not disagreeing with that,
I'm just saying no matter who uses the data, it must be transcribed into
a usable format.


Then it is not the raw data. The suggestion was to provide the raw
data. That means that the original collection of bits has to be
copied bit for bit with no modification. A lot of copy operations
insert zero bits for alignment.
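
The alignment point shows up even in a tiny sketch (Python's struct
module; illustrative, not any real archive format): writing the same
two fields through a naturally aligned layout inserts padding bytes
that were not in the packed original, so the result is not
bit-for-bit identical.

    import struct

    # A one-byte flag followed by a 32-bit count.
    packed  = struct.pack("=BI", 1, 747)   # "=" : no padding
    aligned = struct.pack("@BI", 1, 747)   # "@" : native alignment, pads the flag

    print(len(packed), len(aligned))       # typically 5 and 8 bytes
    # Same two values, but the aligned copy contains extra zero bytes.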

If researchers are cutting corners on data integrity,
posting it on line would be one way to stop that. If they are doing it
right, then there should be no problems in making it available on line.


And I've been trying to tell you that it takes lots of time, money,
and babysitting to make that available. If you pass a rule that
all data has to be made public, resources, which would have normally
been used for the real research, will have to be used for babysitting
those bits. The real science work will not get done.


If you want to call the verification and formatting "massaging",
fine, but if it's not done, the data is unusable.


Exactly. It's unusable to most people except those who run code to
use it as input (which is what scientists do).


And many others who might be seriously interested.


This thread has been talking about non-scientists having access to any
data which was collected; further constraints were declared that the
data had to be prettied up and completely described so that anybody
could access the data and know what it meant.
That would be what you were talking about, not me. All I insisted was
that the data be usable, which I think you are calling "prettied up".

I think it was Eric who wanted stuff made public in an attempt to
prevent what happened with this global warming fiasco the politicians
have been milking for oodles of money.


Who wouldn't?


Sigh! Have I been wasting my time? Strawman.

snip


That should be one of the
deliverables in the data collection contract.

You don't know what you're talking about.
And you're assuming facts not in evidence.
Actually, I'm not assuming anything. I'm talking about moving
bits and presenting them to non-expert readers. I know a lot about
this kind of thing because I did that kind of work for 25 years.
It's you that's worried about "non-expert" readers, not me. I just
want it accessible in a usable form. You don't need to sugar coat
it.
Your kind of usable form requires the raw data to be massaged before
storing it on a public forum.
I guess that depends on your definition of "massaging". As long as
it doesn't corrupt the data, I don't care what you call it, but the
simpler, the better.


You can't tell if the data's been corrupted if it's been reformatted.
You have to have a QA specialist checking.

[this is a thread drift alert]

Shouldn't that be a routine procedure?

By whom? If the data you're using was collected by Leonardo, QA is a
tad problematic.

Or do you expect to use invalid
data to get valid results?


Think of the log tables which were produced and printed. If there is
one typo and somebody used that number to record a data set, the error
goes into the record.


Why would you use a log table to record data? Logs would be used for
some sort of transformation, not raw data, and, unless you have an old
Pentium, not really an issue today.


You are wasting my time. This is a very serious matter and requires
a lot of thinking, discussion, and consideration. Your thinking
style is extremely short-term.


Now get
in your time machine and come back to today. The data set may be used as
input for a lot of analyses today. Now answer your own question. My answer
would be yes; at some point you have to use what is available.


Then it would appear as an instrumental error, either as an outlier, or
buried in the noise.
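
That is where a simple, automated range check earns its keep; a minimal
sketch (Python; the thresholds and readings are made up) that flags
suspect entries for a human to check against the paper original instead
of silently "correcting" them:

    # Hypothetical daily mean temperatures in degrees C; 74.7 is a keying error.
    readings = [11.2, 10.8, 12.1, 74.7, 11.5]

    LOW, HIGH = -60.0, 50.0    # plausible physical range, illustrative only

    for i, value in enumerate(readings):
        if not LOW <= value <= HIGH:
            print(f"record {i}: {value} outside plausible range -- flag for review")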

These are aspects of bit recordings I've been trying to solve for
decades. All of my work was involved with shipping code to customers.
All of this discussion reminds me of the work I did. There are
CATCH-22s, deadly embraces, and impossibilities which are caused by
working with data which is invisible to the human eye.


It sounds like you may be too close to be objective.


I have no idea what you mean. I know a lot about bit integrity
and shipping it. I also know how much work is required.


I'm definitely not talking about a contract.

Then who's paying for it? If it's not taxpayers, then I really
don't care how it's done. If it is from taxes, then there better
be an enforceable contract in place, or we'll be right back where
we are now.

Contract law is different in each and every country.
So? There are still enforceable contracts. How would you do
international business without them?
You sign a contract for each country or entity in which you want to
do business.
Exactly. Why were you trying to make an issue of such an obvious
point?
You're the one who started to talk about contracts.

Which taxpayers do you think paid for the gathering of that data?
Who pays for the data the maritime business provides?
Don't know, don't care. Are you saying the IPCC is not tax-funded?

Where did our $50B go, then? I think grants are generally in the
form of contracts.


You don't even know how things get done.

Again, you might be surprised.
Not at all. You have no idea how much work is involved.
We paid for a lot of work that now appears useless.

It is useless because everybody seems to have depended on one, and only
one, entity for their sources. That is a bloody procedural problem in
the science biz. There aren't independent sources or studies being
used by the politicians, the UN, or in the scientific conclusions. With
the advent of the thingie called the WWW, the myths become the facts at
light speed.


Exactly. Researchers might be a little more careful if they know someone
else is watching. In fact, I'd say the way they treated Steve M is
proof positive they would.


Huh?


I'd rather pay for
careful work done in an open, transparent manner. It's cheaper than
having to redo it.


But that open, transparent manner is expensive, difficult, and in some
cases impossible (unless you develop a time machine). Storing data is
not trivial whether it's public or private.

Take a look at all the problems the open source biz has with computer
code. That shows an "open, transparent manner" is not a trivial
endeavour.


I didn't say it would be easy, just necessary, if we're going to get any
valid results from the clown brigade.


It isn't necessary. You're just trying to fix a symptom, which will
not fix the real problem but will hide it and prevent
work from getting done.

/BAH

