Petabytes of data at Large Hadron Collider - Sixty Symbols
- Published on 8 Feb 2025
- This question is posed on behalf of many Sixty Symbols viewers who asked about it. With thanks to David Barney and Steven Goldfarb, from CMS and ATLAS respectively. See more of our videos from the LHC at • Large Hadron Collider ...
Visit our website at www.sixtysymbol...
We're on Facebook at / sixtysymbols
And Twitter at #!/...
This project for The University of Nottingham is by video journalist Brady Haran
Fantastic. One of my favorite videos on the LHC so far, Brady. To think that they have invented something like "The Trigger" just to deal with the problem of not having the technology to handle so much info is kinda awesome. Before too long, petabyte hard drives will be everywhere.
Glad you liked it.
THANK YOU! I've been wondering about what was involved with the data collection ever since I became interested in this amazing machine - and now I have at least an overview that I can understand. Brady, YOU ROCK. :)
It was one of my first questions too!!!
oooh the bit at the end with links to the others is a nice touch.
Thanks for all the videos people from sixty symbols. It's been fascinating watching through them, you're a credit to the scientific community.
This is really interesting to me. You know, some people seem to think that because hardware keeps getting better we don't have to care about computational efficiency, storage space, or actually handling the data in some smart way, but as the numbers get big enough, any hardware starts to look puny.
Not that it's anything new or that it would only concern CERN but yeah, really intriguing.
I remember on Reddit speaking to someone in the GIS field in Alberta who said they deal with petabytes of data daily; they do seismic analysis and it generates huge amounts of data.
It helps us to understand quantum mechanics and how hadrons break up in high-energy collisions, to see what elementary particles they are made from, and to see whether what we view as 'elementary particles' can be split again.
It also helps us understand antimatter, the only feasible way of powering interstellar travel, besides wormholes.
Congratulations to everyone involved! Now let's get BOINC installed so I can help out!!!
You make the trigger sound so fancy :P The trigger is likely a simple computer program that checks basic properties of each snapshot (or series of snapshots) to see if it's worth saving. It probably includes the amount of energy deposited in each part of the detector, the timing of each event, the number of detectors that actually detect something, etc.
Think of it like a motion sensor that triggers a light to illuminate when it sees sufficient motion in front of it.
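For the curious, here's a rough sketch of the kind of cut such a trigger might apply. The event fields, thresholds and function names are all invented for illustration; the real trigger is a far more elaborate chain of custom electronics and software.

```cpp
#include <cstddef>

// Toy event summary: the kind of coarse quantities a first-level
// trigger might look at (all names and values here are invented).
struct EventSummary {
    double totalEnergyGeV;   // energy deposited across the calorimeters
    double maxMuonPtGeV;     // highest transverse momentum among muon candidates
    std::size_t hitChannels; // number of detector channels that fired
};

// Decide whether an event looks interesting enough to keep.
// A real trigger chains many such conditions over several levels.
bool passesTrigger(const EventSummary& e) {
    const double minTotalEnergyGeV = 200.0; // invented threshold
    const double minMuonPtGeV      = 20.0;  // invented threshold
    const std::size_t minHits      = 50;    // invented threshold

    if (e.hitChannels < minHits) return false;     // mostly empty event: discard
    if (e.maxMuonPtGeV >= minMuonPtGeV) return true; // hard muon candidate: keep
    return e.totalEnergyGeV >= minTotalEnergyGeV;   // otherwise require lots of energy
}
```

The result of a check like passesTrigger would then gate whether the snapshot gets written out at all, exactly like the motion sensor deciding whether to switch the light on.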
I'm going there with my college next March, seriously never been so hyped for a trip ever, I can't bloody wait! It is my dream job to work there :3
The computing power described here is just another reminder of how far technology has progressed. Just think 10 years back, it's amazing how far we've come.
Some of us are, but a channel like SixtySymbols has drawn masses of people who find the content fascinating but may not have the level of understanding we do about certain things :)
My thoughts exactly. I hope so, I'd love to be a part.
I think efficiency and storage space is important. As hardware gets more advanced, so will programs and what we need our storage for. A terabyte of storage means a lot less than it did five to ten years ago.
"Filling up a CD every second." The days when a CD was a relevant unit.
Enjoy...
pretty cool... glad you liked it!
Fascinating
And now we have 6Gb/s M.2 SSDs... how times have changed
Incredibly interesting.
I agree... And lucky that the people talk to me! :)
I imagine they have databases of patterns they get all the time and patterns they want to look more for.
That way, they can find out more about what they want to look for, and anything unique that they haven't seen before will get flagged up and stored.
Guy with the helmet sounds like Tom Hanks
Nice video!
This is an invitation to see an artist theory on the physics of light and time!
This theory is based on two postulates
1. That the quantum wave-particle function Ψ represents the forward passage of time, ΔE Δt ≥ h/2π, itself
2. That Heisenberg's Uncertainty Principle, Δx Δp ≥ h/4π, which is formed by the wave function, is the same uncertainty we have with any future event that we can interact with, turning the possible into the actual!
Brilliant! Cheers Brady. I was wondering all about this :)
CERN is a big boy, but my own project isn't something to sneeze at either. We at the LOFAR project work at 10 Tbit/s, or just over a TB/s. Unlike CERN, who do "snapshots", we can run 24 hours a day. That's basically 90 petabytes in 24 hours. We have to filter and average that data on the fly before we can even attempt to store it. The data stream we store to disk is about 10 GB/s, or about 900 TB per 24 hours.
This went WAY over my head.
Dark matter is not detected directly, but indirectly through the way it affects regular matter around it (its gravitational pull). Even if we don't see it, we know it's there because regular matter is affected by its presence. Maps of dark matter distribution in the universe have already been made.
Actually he is right. My SSDs pull through around 8.2Gbit/s or around 1035MB/s (2x OCZ Vertex 3 in RAID 0), and even mechanical hard drives can manage around 1.6Gbit/s or 200MB/s (referring to the WD VelociRaptor).
Spending a fraction of public money on pure research so that we may finally understand the true nature of our universe is hardly 'useless'. Whilst we should also focus on the areas you mentioned, it is important that we maintain the funding for particle physics and other fields of pure research. For example, if scientists of the past had not devoted their time to understanding the nature of lightning and sparks, we may have never discovered and utilised electricity.
maybe you should run the data centre at CMS with that type of valuable knowledge
300MB/s is roughly 1TB/h, which is the capacity of a modern hard drive every hour. Every day with beams, they fill several of them. 2011 had 73 days of stable beams, which is equivalent to ~1800TB. In addition, it is not sufficient to just store the data - it has to be analysed.
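For anyone who wants to sanity-check those numbers, here's a quick back-of-the-envelope calculation. Round figures only, and it assumes roughly 73 full days of beam - these are my own estimates, not CERN's.

```cpp
#include <cstdio>

int main() {
    const double mbPerSec  = 300.0;                   // rate quoted in the video
    const double tbPerHour = mbPerSec * 3600.0 / 1e6; // ~1.08 TB per hour
    const double tbPerDay  = tbPerHour * 24.0;        // ~26 TB per full day of beam
    const double tb2011    = tbPerDay * 73.0;         // ~1900 TB for 73 beam days

    std::printf("%.2f TB/h, %.0f TB/day, ~%.0f TB for 2011\n",
                tbPerHour, tbPerDay, tb2011);
    // Prints roughly 1.08 TB/h, 26 TB/day and ~1900 TB --
    // the same ballpark as the ~1800 TB figure above once rounding is allowed for.
    return 0;
}
```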
Now I understand exactly how much data was in the SOS Brigade's 436 petabyte logo.
They have to set up the acquisition triggers based on experience and predictions, and, as stated, also record some random events to see if the filters are working as expected.
There simply is no better way to do it with today's level of technology, which this experiment pushed some way further.
Thank you for the answer. But, my fault, I meant how they make the particles collide exactly on the place where the sensors are.
The SBM would have been a good name for it too!!!
That's why they made the minimum bias system, which keeps an assortment of random snapshots that don't necessarily fit the criteria they set, to help remove the human error of prediction and keep, for lack of a better term, a minimum bias.
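Something in the spirit of that minimum-bias sampling can be sketched like this: keep a small random fraction of events no matter what the main trigger says. The class name and the 1-in-100,000 keep fraction are invented for illustration; the real system uses its own prescaled selections.

```cpp
#include <random>

// Keep a small random fraction of events regardless of whether they
// pass the main trigger, so the selection itself can be cross-checked.
// The 1-in-100,000 fraction here is an invented number.
class MinimumBiasSampler {
public:
    explicit MinimumBiasSampler(double keepFraction = 1.0 / 100000.0)
        : keepFraction_(keepFraction), rng_(std::random_device{}()), dist_(0.0, 1.0) {}

    bool keepAnyway() { return dist_(rng_) < keepFraction_; }

private:
    double keepFraction_;
    std::mt19937 rng_;
    std::uniform_real_distribution<double> dist_;
};

// Usage idea: record an event if it passes the physics trigger OR is
// randomly picked by the minimum-bias sampler.
// (passesTrigger is the toy function from the earlier sketch.)
// if (passesTrigger(evt) || minBias.keepAnyway()) record(evt);
```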
There are more than one filter level. The "interesting" stuff doesn't get thrown away. 1 snapshot = one whole event, not a moment in time (to my understanding).
Increase the speed of digital signals for long-distance transmissions and in robotic automation reactions.
@xyznexus It's a perfectly timed process. Magnets guide them along their journey, along with precision timing from computers. And factor in that it's not just one proton - it's thousands and thousands of them at a time.
Yes, but the thing is they have to filter the information they want (which amounts to about 300MB) out of 1 petabyte of information every second, and that is very impressive.
awesome... now where are the processors?
you're welcome!
The "pipes" are magnetized. they use electric magnets to acelerate the particles and to keep them in place
thanks
Please consider watching the presentation made by Wolfgang von Rüden,
Head of CERN openlab, at U.S.I 2011 (stands for University of Information System), about this very subject.
Search "Universe mysteries: hidden answers in billions of petabytes" in the search box.
Discover more about the "trigger"
@sixtysymbols Thanks for the overview, it was a lot of what I wanted to see at the LHC. I am curious if any of the random samples have produced something of interest.
@gasdive YTC: 2DIl3Hfh9tY Leonard Susskind both increased the number of heads hurt and decreased a good number thereafter. By sheer luck, out of the crazy stuff I was looking at recently, I had the opportunity to check it out. Susskind has a bit of celebrity status in the World of Geek; so, I couldn't resist. I quite like it.
Standard SSD seek times can be as low as 0.1ms, meaning in the space of a second 10,000 random accesses can be performed; I think this is measured in IOPS. It's more the cost, even though they give them billions in tax every year :)
Why don't they allow people to store some of these images on their PCs so they can essentially use all the computers in the world for their "farm"? All they have to do is set up a program on a website that retrieves all of the data they cannot store in their own farm, and people can go there and choose how much data they will allow to be stored on their PC per second. Then, the same software that they use to search for whatever it is they're searching for can view the images from your PC.
Oops, I commented before Brady asked about throwing away good information :D
I wasn't disrespecting the Greek people. I am Greek myself and I was simply saying that your logo is confusing from our point of view, at least when we first look at it.
I suspect this is a riff on the "Beware the Leopard" sign from the Hitchhiker's Guide to the Galaxy. :)
Awesome vid Brady! :D
C++ will always be a little bit slower than C (probably because it offers more things). Both C and C++ are the way to go for this.
Agreed. I really doubt it's assembler :D or punched cards. Would be nice to see the code though.
Any more info on the server farm itself? Obviously they've got PLCs and so forth. A lot of specific hardware before it even gets to the servers and clusters most people know of.
And does it blend ?
Any old processor can render fractals, it's how many you want rendered per second and for how long
@sixtysymbols I think that it is too dangerous to get rid of "rubbish" data. Why not store and process the information in a way similar to SETI: install a program on several computers, using their background computing power to calculate the stuff?
You have very nice videos about CMS and ATLAS, but what about the other two (ALICE and LHCb)? Do you plan to make some videos based on those experiments? Thanks!!!
That menu idea is good, but you tend to see what you want to see, not new discoveries. The random shots show them something 'new'.
He mentioned that others could access the data on their grid. How many labs might be linked up to the grid worldwide?
LOL me too - I got halfway through typing my question before it was answered. Damn - need to work on patience.. LOVE THESE VIDS!
Of course, we can see its gravitational effect out in the galaxy! I wasn't really asking for more on the distribution of dark matter. I was hoping for an explanation of direct dark matter detection. I think it might be cool to see it in a video! :)
would you guys know what the pay for a top of the line particle physicist working at the LHC would be?
this was amazing
Speaking of information, can you get the guys to explain conservation of information in a way that doesn't make my head hurt? Particularly the black hole paradox would be interesting and something about holographic quantum mechanics.
So because they're required to filter the information, they need to know what they're looking for, right? How - if possible - would they then discover unexpected results about other "not-thought-about" possibilities? It seems like mission impossible to deduce something from so much information unless you have a definitive hypothesis, such as the Higgs boson. Perhaps vital information escapes the filters simply because we haven't hypothesized any abnormalities in it? Anyway, interesting.
Hi guys, love your episodes. Not sure if you've answered this before, but it's a serious question: why is the hadron collider so big? If you're looking for a single event on the atomic scale, wouldn't it happen faster and more often if you had just a small tube, micrometres in thickness?
You also have to realise that this data is coming in at such a high rate that recording it can become a problem (most HDDs are the slowest parts of your computer). Now, a solution to that could be splitting up the stream, but that in itself has problems, and the reliability needed for a facility like this must be above average (average being the uptime of YouTube). But don't worry, I'm sure a guy comes in every day to check what needs to be upgraded anyway.
This is so crazy.. I just can't imagine what will happen to technology in 20 more years.
Good way to start the video, with a Michigan prof.
Go Blue.
But nice video, thanks
When the hell can I get a petabyte in my system? 70 years ago, the computers we would need to run today's stuff would have filled a warehouse. Just imagine the endless possibilities!
It literally is. It's kept at about 2 kelvin.
So why is such a big piece of data not important? To my common sense, two identical protons colliding with the same momentum should give identical products. My head hurts - what determines the outcome of the collision? Do certain collisions have the same properties, or do you change something?
what cables are you using to transfer data?
a petabyte a second ftw!!
But can it blend?
How do you make the particles collide always there, especially when they move near the speed of light?
Love your videos, but I want to know more about other topics besides ATLAS (not that I don't appreciate the ATLAS video). How about dark matter detection? How can we detect something that can't interact with matter?
Thanks
May I ask, is it possible that some of the 'useful' data gets accidentally filtered out?
Just so you know, the LHC systems run Linux or Unix command-line applications.
It's pretty cool, but 300MB/s really isn't much data. Someone earlier said his SSD could do it, which it actually can, easily. I have a fat old RAID0 of 6x2TB spinning disks and that benchmarks around 700MB/s sustained and 1200MB/s burst. The main issue here is my RAID controller being an ICH10R, which can only handle about 680MB/s of incompressible data. If we move up from desktop PC technology to servers, I have used big SSD RAIDs that can do 3000MB/s pretty easily.
They did, and also said that it's to see, from a few random samples, whether they can identify matching patterns manually, although most of the picking is done by the computers. Your point is? It doesn't change the fact it's the ultimate cherry picking, and therefore scientifically invalid.
I bet it could hold enough 1080p movies to watch for the rest of your life and 9001 generations after that without pause.
Lots of parallel cables I would presume. Would have to be Multimode fiber too.
There is no doubt about it... The Large Hadron Collider is the coolest thing on the planet!
No they'll get it... I think you need to give Greek people more credit!
Incredible
The storage doesn't impress me much; 100 MB/s can easily be done even with consumer-grade hardware.
What does impress me, however, is the trigger processing and the transmission speeds (of the initially captured data).
Say hi to the IT crowd in the basement, they're heroes.
Probably not, because A) it probably doesn't have a gaming GPU (they might have "GPUs", but in reality those are just parallel processing units), and B) these probably aren't running an OS supported by the Crytek engine.
SWEET!
the physics is amazing. But that's also some hardcore computing power & storage....
The number of bytes going through the LHC IS stupidly huge
The guy with the yellow hat at the beginning kinda looks like the guy from the movie "Bad Santa". I forget what his name is.
As an IT student I have to ask: in what language, or languages, is the software written?
Amusing that storing 100 or 300MB per second was a "massive deal" just 5 years ago, and these days consumer-level SSDs can easily write >500MB/s.
Yeah, they can transfer data that fast, but imagine that much data coming in /continuously/, 24/7. Storing and processing that much data is still a challenge.
Those collisions all look alike. One can see like 3 different things, so how did they end up with 60+ subparticles? All on acid?
No, because I don't think it has a sufficient graphics card, since that would be useless when just calculating mathematics.
Being a system administrator at CERN would be the best job ever :P