Showing posts with label Science. Show all posts

Friday, October 5, 2012

How 4.5 Petabytes can be compressed to 42 Kilobytes



Theoretical part by I_Wont_Draw_That

Compression comes down to information theory, which is the branch of mathematics and computer science devoted to describing the amount of information in some string of characters. The attack involves compressing a string which is extremely long, but also contains an extremely low amount of information.
Consider the string: abcdefghijklmnopqrstuvwxyz
That is 26 characters. However, the amount of information stored in it may be lower. I could instead represent it as "the english alphabet", which is only 20 characters, and you still know what I mean. Given enough context, I could represent it as "alphabet". And given a shared understanding that "0 means abcdefghijklmnopqrstuvwxyz", I could represent it as 0.
Now consider this string: aaabbbaaabbbaaabbbaaabbbaaabbbaaabbbaaabbbaaabbbaaabbbaaabbbaaabbbaaabbb
That's 72 characters. But what if I gave you something like: 
0=aaa 1=bbb 010101010101010101010101
That's only 36 characters. Or what if I compressed it more? 
0=aaabbb 000000000000
That's only 21 characters. So we compressed our input from 72 characters to 21 characters without losing any information. This is effectively what zipping a file does. It builds a dictionary of common patterns, aliasing them to shorter strings, and then uses the aliases in their place.
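The substitution above can be reproduced mechanically. The following is a minimal sketch, not how zip works internally: it uses sed to swap the pattern for the alias 0, and the variable names are made up for illustration.

```shell
# Build the 72-character input: "aaabbb" repeated 12 times.
# (printf reuses its format once per argument; %.0s prints nothing.)
original=$(printf 'aaabbb%.0s' 1 2 3 4 5 6 7 8 9 10 11 12)

# "Compress": one dictionary entry (0=aaabbb) followed by the aliased body.
body=$(printf '%s' "$original" | sed 's/aaabbb/0/g')
encoded="0=aaabbb $body"

# "Decompress": drop the dictionary entry and expand every alias again.
decoded=$(printf '%s' "${encoded#* }" | sed 's/0/aaabbb/g')

echo "original: ${#original} characters"
echo "encoded:  ${#encoded} characters"
[ "$decoded" = "$original" ] && echo "round trip OK"
```

The decoded string matching the original is what "without losing any information" means in practice.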
The fewer unique substrings there are, the more compressible the data is, because the dictionary can be smaller, so each alias can be shorter. What happens, then, if the entire input is one pattern repeated many, many times?
For instance, suppose the original string had been 0 repeated a trillion times. To write that string out completely would require 1 terabyte (1 byte per "0" times a trillion of them). But as you just saw, I can easily represent it just as well as "0 repeated a trillion times", which is much, much shorter. That's basically what's happening here. The original content is extremely large, but equally simple, so it compresses into almost nothing. When inflated, it's gigantic.
This extreme runs the other way, as well. For any given compression algorithm, there are inputs which cannot be compressed at all.
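Both extremes are easy to observe. This is a rough sketch, assuming a Linux-like system with gzip, head and wc available (gzip uses the same DEFLATE method as a standard zip). A megabyte of zero bytes should shrink to roughly a kilobyte, while a megabyte of random bytes should not shrink at all.

```shell
size=1000000

# Highly repetitive input: one million zero bytes.
zeros=$(head -c "$size" /dev/zero | gzip -9 -c | wc -c)

# Effectively incompressible input: one million random bytes.
random=$(head -c "$size" /dev/urandom | gzip -9 -c | wc -c)

echo "zeros:  $size bytes -> $zeros bytes"
echo "random: $size bytes -> $random bytes"
```

The random case typically comes out slightly larger than the input, because the compressed format adds its own headers on top of data it could not shrink.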

Practical part by Rohaq

Basic zip bombs are pretty easy to make.
Create a massive file, let's say, a gig in size, which is full of zeroes, then zip it up. Because the content of the file is uniformly repeated throughout, it compresses very easily:
$ dd if=/dev/zero bs=1024 count=1000000 | zip zipbomb1.zip -
  adding: -1000000+0 records in
1000000+0 records out
1024000000 bytes (1.0 GB) copied, 9.97309 s, 103 MB/s
 (deflated 100%)
$ ls -lh zipbomb1.zip
-rw-r--r-- 1 me me 971K 2012-08-01 18:21 zipbomb1.zip
(The above is under Linux, and pushes 1024*1000000 zero bytes into a zip file with standard compression. Note that /dev/zero supplies NUL bytes, not '0' characters.)
Copy that zip file ten times over. Then add all of these zip files into a single zip file. Because the file content across each zip file is exactly the same, again, this compresses very well:
$ zip -9 zipbomb-lvl2-1.zip zipbomb*
  adding: zipbomb10.zip (deflated 100%)
  adding: zipbomb1.zip (deflated 100%)
  adding: zipbomb2.zip (deflated 100%)
  adding: zipbomb3.zip (deflated 100%)
  adding: zipbomb4.zip (deflated 100%)
  adding: zipbomb5.zip (deflated 100%)
  adding: zipbomb6.zip (deflated 100%)
  adding: zipbomb7.zip (deflated 100%)
  adding: zipbomb8.zip (deflated 100%)
  adding: zipbomb9.zip (deflated 100%)
$ ls -lh zipbomb-lvl2-1.zip 
-rw-r--r-- 1 me me 28K 2012-08-01 18:26 zipbomb-lvl2-1.zip
(The above adds all of the copied zip files into the zip file zipbomb-lvl2-1.zip with the highest level of compression)
Now copy that zip file ten times over, and zip them all up again. Rinse and repeat, let's say 10 layers deep.
So following from the compression basics people have been mentioning, and ignoring individual file headers etc., the above could be compressed as something as simple as:
[[[[[[[[[[[0]{1024000000}]{10}]{10}]{10}]{10}]{10}]{10}]{10}]{10}]{10}]{10}
Now a virus scanner comes along and attempts to scan the zip file. It decompresses the first set of zips, then decompresses each of those, then decompresses each of those. Eventually it gets to the lowest layer, and attempts to decompress these files into memory. At this point you're attempting to decompress 10^10 1GB files into memory, so unless you have about 9.3 exabytes of RAM at hand, you're in trouble. And since some scanners automatically scan new files, you could be in trouble as soon as you receive or open the file.
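As a rough check on that figure, assuming ten layers of ten copies each and treating each bottom-level file as 2^30 bytes:

```latex
\[
10^{10}\ \text{files} \times 2^{30}\ \text{bytes}
  = \frac{10^{10} \times 2^{30}}{2^{60}}\ \text{EiB}
  = \frac{10^{10}}{2^{30}}\ \text{EiB}
  \approx 9.3\ \text{EiB}
\]
```

With the 1,024,000,000-byte file that the dd command above actually produces, the total is about 1.02 x 10^19 bytes, or roughly 8.9 EiB. Either way it is far beyond any machine's memory.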
Scanners nowadays generally have checks in place to make sure that they're not affected by zip bombs, however, which is probably why MSE (Microsoft Security Essentials) is no longer detecting it as a threat.
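One plausible form such a check could take is to cap how much output the scanner will accept from any single archive. The sketch below is an assumption about the general idea, not a description of what MSE or any particular scanner does. It uses gzip as a stand-in decompressor, and the limit and file names are hypothetical.

```shell
limit=1048576                      # hypothetical cap: 1 MiB of output
suspect=$(mktemp)                  # stand-in for a received archive

# Make a small file that inflates to 50 MB of zero bytes.
head -c 50000000 /dev/zero | gzip -c > "$suspect"

# Decompress as a stream, but read at most limit+1 bytes. Once head has
# enough it stops reading, and gzip is cut off instead of inflating the rest.
# (gzip ending early with a broken pipe is expected here, hence "|| true".)
seen=$(gzip -dc "$suspect" 2>/dev/null | head -c "$((limit + 1))" | wc -c) || true

if [ "$seen" -gt "$limit" ]; then
    verdict="rejected: output exceeds $limit bytes"
else
    verdict="accepted: $seen bytes"
fi
echo "$verdict"
rm -f "$suspect"
```

A real scanner would presumably also limit how many layers of nested archives it is willing to follow, since the bomb described above relies on nesting as much as on raw size.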

Thursday, October 4, 2012

The Four Forces of Nature


There are four "fundamental interactions" -- these are the four very basic types of forces that affect particles. They are the strong interaction, the weak interaction, the electromagnetic interaction, and the gravitational interaction.
Electromagnetic: we're most familiar with this interaction, and it has the most direct effect on our day to day lives. It is very, very strong -- many orders of magnitude stronger than gravity. The EM interaction dictates all of chemistry. If you've ever picked something up, or felt friction, or drunk water, or done anything that has nothing to do with radiation, nuclear forces, or gravity, then it's dictated by the electromagnetic interaction. The EM interaction is mediated by the photon, and its study at the quantum level is called QED: [1] Quantum Electrodynamics. Richard Feynman made a lot of progress here.
Strong Interaction: if we look closely at the nucleus of an atom, we'll find that the strong interaction shows up in two places: it holds protons and neutrons together inside the nucleus, and it also holds quarks together to form protons and neutrons and other hadrons. The strong interaction is even stronger than EM--but its effects fall off very quickly with distance so we don't really experience it at the macroscopic scale. We discovered the strong interaction because we couldn't figure out how EM could hold things together inside the nucleus. The study of the strong interaction is called [2] Quantum Chromodynamics, and is very interesting.
Weak Interaction: This one dictates radioactive decay; the forces are mediated by the W and Z bosons.
Gravity: gravity is very, very weak -- many orders of magnitude weaker than the strong force. We don't notice gravity between human-scale objects; it only becomes significant at astronomical sizes (planets, stars, etc). Because it's so weak, it's exceedingly hard to study. When looking at subatomic particles, the EM and Strong forces are so much more powerful than gravity that it's nearly impossible to see the effects of gravity at a small scale. Because of gravity's weakness, we have not been able to study it closely at the quantum level. Gravity is "split off" because it's too weak to study at a quantum scale. It's hard to see and it's hard to study. Perhaps if we understood more of its characteristics at the quantum scale we'd get some more hints about how to reconcile the maths.
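To put a number on "many orders of magnitude", one common comparison (an illustration, not part of the original explanation) is the electric repulsion versus the gravitational attraction between two protons. Both forces fall off as 1/r^2, so the distance cancels. With all constants in SI units:

```latex
\[
\frac{F_{\mathrm{EM}}}{F_{\mathrm{grav}}}
  = \frac{k_e\, e^2 / r^2}{G\, m_p^2 / r^2}
  = \frac{k_e\, e^2}{G\, m_p^2}
  \approx \frac{(8.99 \times 10^{9})(1.60 \times 10^{-19})^2}
               {(6.67 \times 10^{-11})(1.67 \times 10^{-27})^2}
  \approx 1.2 \times 10^{36}
\]
```

So for a pair of protons, gravity is weaker than electromagnetism by about 36 orders of magnitude.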
Now it turns out that some very smart people discovered that Electromagnetism and the Weak interaction are actually two aspects of a single interaction which we call "the electroweak". Electromagnetism and radioactive decay are therefore two facets of one "parent" interaction -- leaving us with only 3 fundamental interactions! We also have strong evidence to suspect that the Strong interaction can be combined with the Electroweak interaction, and I think we've made progress there, but I'm not up to date on this.
So there's evidence that the Strong, Weak, and EM interactions can be combined into one. Given that, why wouldn't we be able to bring gravity into the mix? We should be able to unify the four into one big theory, and show each one as a different facet of the "unified field theory". The main problem is that we don't understand gravity as much as we'd like to, because it's too weak to study. We haven't figured out the math yet -- because with our current understanding of gravity, the math doesn't work out correctly. If we could more accurately characterize gravity (perhaps there's something that's too small to see yet), our understanding of gravity might change slightly and we'd be able to fit it in with the others.
Slight clarification: We don't understand quantum gravity as much as we'd like to. General relativity, however, gives us an excellent framework for macroscopic gravity. Our main issue is using what we know from relativity in conjunction with quantum physics. Einstein's relativity works so well that it's hard to imagine describing gravity any other way; this is what I mean when I say "we don't understand quantum gravity well enough"--we understand gravity excellently, but we don't understand it at the quantum level.

Tuesday, October 2, 2012

Why don't hair cells heal themselves like cuts and scrapes do? Will we have solutions to this soon?



I work on the development of neurosensory cells in the cochlea, with the goal of figuring out the secret to hair cell regeneration.
Mammals have lost the ability to regenerate hair cells (the types of cells that translate sound waves into a neural signal) after damage. Birds and reptiles, however, have maintained that ability, and after trauma, infection, or drug-induced hair cell loss, a non-sensory supporting cell will transdifferentiate (change from one differentiated cell type to another) into a mechanosensory hair cell. Why exactly can't mammals do this? Well, we're not exactly sure. There are all sorts of inhibitory signals within the mature mammalian cochlea that prevent cell division or transdifferentiation (which is also one reason why we never see any cancer in this system; the body basically has all the proliferation completely shut off). So we try to figure out if there are ways around this apparent moratorium on proliferation/differentiation in mammalian cochleae, and if there's a way to open up the possibility of regenerating hair cells in the mature mammalian cochlea.
With gene therapy or viral vectors, we have been able to grow hair cells in vitro. That's true; in fact, it doesn't even take anything that complicated to grow hair cells in culture - you just need to dump atoh1 protein (the master gene for hair cell development) on some competent cells and they will turn into hair cells (they'll even recruit neighboring cells to become supporting cells). But that doesn't really help us regenerate hair cells in the mature mammalian cochlea - those cells aren't really competent to respond to that signal once they're past a certain point. There have been a few studies that have succeeded in generating transdifferentiated hair cells from support cells using genetic systems to overexpress those genes that direct a hair cell fate - but this only lasts about a month after birth before you start losing that effect. And on top of that, the functionality of the hair cells that were generated was questionable. And of course, these animals were genetically engineered to have these genes turned on at certain points; this is obviously not a viable option to translate into human treatment.
So it still remains that gene therapy is probably our best shot to regenerate hair cells in a mature human cochlea. The only problem is we don't know exactly what combination of genes will do the trick on a mature cochlea. So a lot of work is done on figuring out how this happens normally, then trying to find a way to manipulate that system. Since this is my field, I could go on forever about this, but I don't want to start getting too tangential or far out, especially since I don't have time to look up sources (gotta go work on some of my mice right now) but if y'all have any questions I'll do my best to answer them when I get a chance.

Thursday, September 27, 2012

When I turn off the lights, where does all the light go?



Light is a form of energy, but when you turn the light off, the light goes away, so where does the energy go?
The short answer is: it gets absorbed by the wall as heat.
The longer answer needs a bit of a more detailed mental picture. The wall is a solid, which consists of a (fairly) regular structure of atoms. Just imagine a grid of hard spheres lying against each other. This is the surface of the wall. At absolute zero, these atoms do not move and are simply at rest, one just touching the next. Having a temperature means that the wall contains thermal energy. This thermal energy is a random motion of the atoms around their equilibrium points; they're basically vibrating. Such a vibration can travel rather far through the lattice in the form of a wave. One ball pushes the next, which pushes the next, which pushes ... etc. Such a wave is commonly called a 'phonon', because it is also the way in which sound can move through solids.
Now think of the light. Light consists of tiny particles called photons, not to be confused with the phonons in the wall. Each photon is a tiny packet of electromagnetic energy and momentum. If such a photon hits (an atom of) the wall, its energy and momentum are absorbed. Since both these quantities need to be conserved, it means the atom will get a little "kick" from absorbing the photon. It will move, and kick against its neighbor, etc etc. So basically the photon has been converted into a phonon.
If enough photons get absorbed, this will result in the wall warming up slightly. So the light gets converted into thermal energy in the wall.
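For a sense of scale, the energy and momentum of a single photon follow from its wavelength. Taking green light at 550 nm as an assumed example (the wavelength is chosen purely for illustration):

```latex
\[
E = \frac{hc}{\lambda}
  = \frac{(6.63 \times 10^{-34}\ \mathrm{J\,s})(3.00 \times 10^{8}\ \mathrm{m/s})}
         {5.5 \times 10^{-7}\ \mathrm{m}}
  \approx 3.6 \times 10^{-19}\ \mathrm{J},
\qquad
p = \frac{h}{\lambda}
  = \frac{6.63 \times 10^{-34}\ \mathrm{J\,s}}{5.5 \times 10^{-7}\ \mathrm{m}}
  \approx 1.2 \times 10^{-27}\ \mathrm{kg\,m/s}
\]
```

Each absorbed photon therefore delivers a minuscule kick, which is why it takes a great many of them to warm the wall even slightly.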
It's rather analogous to a stone falling into a lake. The energy of the stone will spread out over the surface of the water in the form of waves. The water itself doesn't move much, but the waves can carry the energy quite far. Likewise, the atoms don't move much, but the energy/momentum from the photons can carry rather deep into the wall.

Wednesday, September 26, 2012

How different would the movie Jurassic Park be with today's information?



The appearance and behavior of dinosaurs is largely a matter of speculation. There are a few things that would be updated. The Velociraptors would have some sort of feathery integument, as would the baby T. rex. Maybe some of the animals would show more color than gray, brown or moss-green. But that doesn't take much thinking, and the science of paleontology hasn't been able to ascertain much about dinosaur color, unless preserved feathers are found (they have been, and the colors include black, white, and a sort of umbery, rusty color; I believe someone else mentions this in a post).
Jurassic Park is now 20 years out of date. If you're looking to update the science and still retain a compelling story, you're going to end up with something like this:
The crucial part of Crichton's idea was that the amber which preserved the mosquito served as a preservative barrier- a seal which locked away the precious dinosaur blood from contaminants and harm- a simple idea which ultimately proved very compelling for a story.
Now there are definitely issues with this. You're not going to set up a lab and get extinct animal blood from a dead bug anytime soon. Plus, after sitting in a chunk of resin for millions of years there is certainly going to be some mingling between the mosquito DNA and the DNA of whatever it fed on and anything else trapped in the sap. Wouldn't it be nice to see THAT come out of an egg? Yeesh! I digress.
The one thing that people have heard about Jurassic Park if they've heard anything in the last 20 years, is that "you cannot clone dinosaurs from blood in mosquitoes trapped in amber." So how do we move away from that, but still make dinosaurs? Because no one is going to be amazed by the trapped mosquito/dino DNA idea anymore. They know it. It's part of popular culture, like "don't cross the streams" or "He's been dead the whole movie!" How do we make the core part of Jurassic Park new?
Easy.
One of the biggest developments in paleontological research in the last few decades has been the discovery of soft tissues preserved in fossil bone interiors. These bones come from the badlands, like any other dinosaur fossil, but they are excavated using sterile field techniques and without polymer consolidants (glues) to keep contaminants from entering the bones' interiors (I know this because I have done it). The fossils are then taken back to a sterile lab where the mineral components are dissolved in baths. If the dinosaur bones were truly permineralized (i.e., all 'rock') then the entire fossil would basically dissolve in solution. BUT! That didn't happen when the first lab tests of this kind were conducted back in the early 2000s. There was stuff left over after the mineral components had dissolved away. Spongy, squishy, stretchy, soft stuff. Paleontologists have documented what appear to be bits of collagen (connective tissues), and remnants of blood and bone cells from those samples. There are also bits of proteins that may be preserved. This was absolutely unheard of when Crichton wrote Jurassic Park more than 20 years ago. Now, in the real world, accessing DNA from hundred-million-year-old soft tissue is not yet viable, but in 1990, neither was sucking out a fossilized mosquito's guts. But it was brilliant science fiction. And while no one has ever actually pulled blood from a fossilized mosquito...
I'm sorry but take a moment and get ready for this realization:
WE HAVE ACTUAL HONEST-TO-GOODNESS DINOSAUR TISSUE AND CELLS. HOLY SHIT!!
What does this mean? It means that there's no more need for the old amber-bug-blood plot line! Now, instead of mining for amber in the jungle playing roulette with mosquitoes (there's no way of knowing what kind of animal a mosquito had bitten just by looking at the thing--Hammond would have had to sort through thousands of mosquitoes before finding one that had actually bitten a dinosaur), you can go to the badlands and look for soft tissue from ANY DINOSAUR YOU WANT. How's THAT for an overhaul? It completely updates the heart of Jurassic Park's story and allows it to remain a sort of beacon for trendy Sci Fi (yes, and you can have your cloning morality play too). It also removes a lot of inconsistencies, like "How did they clone extinct plants? Mosquitoes don't drink plant blood" and for scientists, it seems more plausible because if you want a park with, say, a Triceratops in it, all you have to do is go to Montana, South Dakota or Wyoming, poke around until you find some Triceratops bones poking out from a nice, thick sandstone unit, and BAM- pretty damned good chance you could get some soft tissues out of there.
The second big change for Jurassic Park would have to be the DNA gap-filling. No more amphibian DNA. Birds. They would need to use a more ancient bird, like an Emu, Cassowary, Rhea or Ostrich. These large, flightless birds (collectively known as Ratites) are some of the most primitive-looking birds living. There has been a lot of genetic work done on chickens lately, and chicken DNA might work as well because we know so much about it. In a Sci Fi story it would not be much of a stretch to say that we have control over the chicken genome, and thus could reduce it back to a sort of "stem" state, where the genetic instructions basically say to build an archosaur-like animal, and the combination of the Dinosaur DNA with the trimmed chicken genome causes the dinosaur DNA to take over and build a dinosaur.
If I had my way and could write a Jurassic Park sequel, it would go like this:
Soft tissue in fossil bones has changed paleontology. Alan Grant and co. are leaders in this area of research due to their years of field experience.
Lewis Dodgson is the bad guy who never got his chance. He was instrumental in the first two books, but gets 3 minutes of screen time in the first movie. He's sinister, greedy, selfish, and cares only for profit. He has no moral scruples, other than his desire to make a profit for himself. Use him as the antagonist for the 4th movie. He's never gotten over his loss at Nedry's hands. He never really gave up cloning dinosaurs. He sees money in them. His company has been sequencing genomes, and he has focused on birds- domestic fowl, endangered species, you name it. He spends a long time waiting. Then he hears about soft tissue preservation in fossil bones- blood cells, proteins...could there be DNA? Perhaps he is tempted to sneak out some of Grant's specimens without permission...
Point is- not only could you clone dinosaurs with the soft tissue story line, but marine reptiles, too. Giant ichthyosaurs, mosasaurs, plesiosaurs...there's a lot of scary stuff in the ancient sea! For the purpose of Sci Fi, anything that's fossilized could be fair game! There's a lot of cool, extinct animals out there, people. Big, scary extinct animals...