“Simran, what did you do with the moon?”
Though he couldn’t see her across the cubicle walls, Nico’s voice carried, and it was rare that his yelling wasn’t met by an equally mock-petulant answer.
“Which one, Nico?”
“I didn’t do anything to the moon. Do I look like I did something to the moon?”
“I don’t know, I can’t see you. Come over here, look me in the eye, and tell me you didn’t do anything to the moon.”
– ❉ –
“OK, I might have done something to the moon. Just not consciously?”
“Be serious, Simran. Yeah, there might be something wrong in the data from the telescope, but what if it’s real? This could be a really interesting result…”
“I’ll search around in the literature.”
– ❉ –
A few days later, Simran walked into Nico’s cubicle with excitement writ large on her face. Nodding to Dr. K., who had just finished pointing something out, she started to talk a mile a minute:
“You might have been right, Nico: a few unusual disappearances like this have been observed before, but never with dense enough data to say anything about them. But we have that kind of data! We could make something of this—”
Nico interrupted her, looking rather sheepish.
“Simran — remember that preprocessing script I got from the old postdoc?”
“Yeah. We’re using a stride of 100 timesteps since the analysis is so computationally expensive, but I guess he never ran it on more than ten frames (that was probably back before we got the XSEDE hours)… the output files are named with whatever the first three digits of the input timestamp are. I had the script start printing the file names:”
dat-000056.hdf5 --> filtered-56..hdf5
...
dat-009756.hdf5 --> filtered-975.hdf5
dat-009856.hdf5 --> filtered-985.hdf5
dat-009956.hdf5 --> filtered-995.hdf5
dat-010056.hdf5 --> filtered-100.hdf5
“Yup. The data was just out of order in time because of the bug, so of course that moon disappears one frame to the next.”
“Niiiiico this is so embarrassing,” Simran sighed. “I’ve spent the last two days scouring the literature. I even called the telescope techs to check for malfunctions on those days. Twice. Ugh.”
Dr. K. got out of her chair and gave Simran an almost conciliatory pat.
“You’ll dry off eventually.”
“You fell in the well!”
Nico turned back to his computer and began to type, and an astute observer might have heard him mutter “here we go.”
Dr. Kaxiros cleared her throat and began:
“A long time ago there was a troop of monkeys who, under cover of night and with only the light of the moon to guide them, went to raid the fruits of a bountiful orchard. But, all of a sudden, the Moon disappeared behind the clouds, throwing the landscape into darkness and the monkeys into panic.
”‘Where is the Moon?’ they cried, ‘She is gone!’
“As they searched about, looking to the ground for where she had fallen, the monkeys came to a well, and in its waters they saw the Moon reflected.
”‘We must rescue the Moon!’ they cried to one another. So, each grasping the hand of his neighbor, the monkeys formed a chain, lowering themselves into the well. But then the branch of the tree they were hanging from snapped, and they all fell into the well.
“After grousing about and generally being soggy, one of the monkeys looked up and noticed that the Moon was still exactly where they had first left her.”
There was a pause.
“Ooh ooh aah aah,” Simran deadpanned.
Background: Things like this really happen. The specialized nature of scientific data analysis and the constraints of budget and time under which it is conducted often mean that the tools used are mostly held together with the digital equivalent of decade-old duct tape. There’s nothing wrong with that, but it sometimes means that adapting tools to new use cases, or even just new computer systems, can present strange and subtle issues.
XSEDE is a real supercomputing program funded by the NSF, and .hdf5 is the extension for HDF5, the Hierarchical Data Format, an emerging standard for scientific data storage. (In some fields, anyway: in materials science, we still use a bunch of undocumented plain-text formats with little rhyme or reason that no one has changed since the ’90s. They have not improved with age.)
Naming the star with “Gliese” is a reference to the Gliese Catalogue of Nearby Stars, a common source of star designations.
Disclaimer: In case you didn’t read the intro: I am not an astrophysicist, and most of this astrophysics is made up. These stories are allegories for general classes of failings and issues of rigor in science. Silly issues like the truncated timestamp might seem made up, but I’ve probably spent months of my life, in total, tracking down these sorts of bugs. Just because the cause is simple doesn’t mean the effects are straightforward. Remember Story 2?
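For the curious, here is one way a script could produce exactly the file names printed in the story. This is only a guess at the bug, since the story never shows the postdoc's code. `buggy_output_name` and `fixed_output_name` are names invented for this sketch, and the `dat-<timestamp>.hdf5` pattern is assumed from the printed log.

```python
import re


def buggy_output_name(input_name: str) -> str:
    """Hypothetical reconstruction of the naming bug (an assumption, not the real script)."""
    # Drop the "dat-" prefix and any leading zeros: "dat-000056.hdf5" -> "56.hdf5"
    stem = input_name[len("dat-"):].lstrip("0")
    # Keep only the first three characters, whatever they happen to be.
    return f"filtered-{stem[:3]}.hdf5"


def fixed_output_name(input_name: str) -> str:
    """Keep the full zero-padded timestamp so names sort in time order."""
    match = re.fullmatch(r"dat-(\d+)\.hdf5", input_name)
    if match is None:
        raise ValueError(f"unexpected input file name: {input_name!r}")
    return f"filtered-{match.group(1)}.hdf5"


for name in ["dat-000056.hdf5", "dat-009956.hdf5", "dat-010056.hdf5"]:
    print(name, "-->", buggy_output_name(name), "| fixed:", fixed_output_name(name))
```

With ten or fewer frames at this stride, three characters are enough to tell the outputs apart and keep them in order, which would explain why nobody noticed. The stray dot in `filtered-56..hdf5` falls out of the same slice: the third character of `56.hdf5` is the dot itself.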
The Moral: Sometimes things are just mistakes. Validate, verify, sanity-check, and double-check — always. And even though validation becomes harder and harder the more complicated a system becomes, it also becomes more important, because there are more places for small issues to hide.
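As a minimal sketch of what "validate" could look like for this particular bug: assume all we want to guarantee is that a renaming step never changes a file's timestep and never writes two inputs to the same output. `timestep_of` and `check_renames` are hypothetical helpers written for this example, not part of any real pipeline.

```python
import re


def timestep_of(name: str) -> int:
    """Return the first run of digits that directly follows a hyphen, as an integer."""
    match = re.search(r"-(\d+)", name)
    if match is None:
        raise ValueError(f"no timestep found in {name!r}")
    return int(match.group(1))


def check_renames(pairs):
    """Return a list of human-readable problems found in (input, output) name pairs."""
    problems = []
    written_by = {}  # output name -> input name that produced it
    for src, dst in pairs:
        # Round-trip check: the output name must encode the same timestep as the input.
        if timestep_of(src) != timestep_of(dst):
            problems.append(f"{src} -> {dst}: timestep changed")
        # Collision check: no two inputs may map to the same output file.
        if dst in written_by:
            problems.append(f"{dst}: written by both {written_by[dst]} and {src}")
        written_by[dst] = src
    return problems


log = [
    ("dat-000056.hdf5", "filtered-56..hdf5"),
    ("dat-009756.hdf5", "filtered-975.hdf5"),
    ("dat-009856.hdf5", "filtered-985.hdf5"),
    ("dat-009956.hdf5", "filtered-995.hdf5"),
    ("dat-010056.hdf5", "filtered-100.hdf5"),
]
for problem in check_renames(log):
    print(problem)
```

A check like this takes a minute to run on the script's own log, which is a lot cheaper than two days in the literature and two calls to the telescope techs.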
Image Source: Wikipedia.
Story Source: The Indian story, “The Moon in the Well,” with text from Indian Fables and Folklore by Shovona Devi, via the course UnTextbook.