Notepad.exe Will Snitch On You (full coding project)
- Published on 27 Feb 2024
- jh.live/plextrac || Save time and effort on pentest reports with PlexTrac's premiere reporting & collaborative platform in a FREE one-month trial! jh.live/plextrac 😎
Free Cybersecurity Education and Ethical Hacking with John Hammond
📧JOIN MY NEWSLETTER ➡ jh.live/email
🙏SUPPORT THE CHANNEL ➡ jh.live/patreon
🤝 SPONSOR THE CHANNEL ➡ jh.live/sponsor
🌎FOLLOW ME EVERYWHERE ➡ jh.live/twitter ↔ jh.live/linkedin ↔ jh.live/discord ↔ jh.live/instagram ↔ jh.live/tiktok
💥 SEND ME MALWARE ➡ jh.live/malware
🔥YouTube ALGORITHM ➡ Like, Comment, & Subscribe!
mans took nearly a full hour to say "notepad.exe has on-disk retention of the scratch buffer" 💀
Thanks. Thought he was going to say Microsoft looks at the data to give recommendations. Can now spend the hour back to gtd in Emacs on a non ms os.
Yeh that’s why didn’t subscribe and stopped at 1:08 too much air and sound coming out of his mouth
My goodness, how is this a whole hour, I suppose the way he repeated and demonstrated the same thing like 4 times in the first minute should tell me
To be fair, a good chunk of the population has to have things explained and demonstrated to them multiple times because... well... take a wild guess
@@neilpatrickhairless Implying people are not intelligent because they aren't power users is a very arrogant view to have.
Notepad++ has done that for literally a decade.
as does sublime, but generally those tools aren't going to be installed by someone who isn't already a power user
Not to mention, it's just another place for people to look for sensitive information, and another place that has to be monitored for suspicious activity, potentially.
@@reanimationxp even good old editor is mostly used by power users. normies buy word
... and the Notepad++ cache files are just .txt files.
😅😅😅😅😅😅😅😅😅😅😅😊😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😮😅😅😅😅😅😅😅😅😅😅😅😊😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😊😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅
010 Editor and its FOSS counterpart, ImHex, both have an insanely useful feature called Patterns (or Templates) that makes it a lot easier to reverse engineer binary structures by defining them in a C-like struct syntax. It also helps with visualizing or color-coding the specific byte ranges. I'd love you to make a dedicated video about pattern-based hex editors because it's genuinely one of the most useful things for figuring out the layout of a binary format.
that would be nice, I used that a long time ago in HexWorkshop, makes things so much easier.
also, ImHex looks interesting :)
Yea. I wanted to make a template for this, and I actually suggested it, but it takes some time to learn the template script. I have written a few myself, but tbh IDK how I would write the template for this format. It would be a good video, though. Learning bt templates or ImHex patterns.
I used 010 editor for a while and it is very good, but its pattern language felt unnatural and inconsistent. I switched to ImHex shortly after and I absolutely love it now. I still have to use FlexHex if I need to compare two binary files (deletions & insertions) because other hex editors are way too slow.
@@UnrealSecurity My only problem with imhex is that it's still pretty buggy. Scrolling the hex view is a pain, the pattern breakpoints don't always work and sometimes the whole program crashes. Other than that it's pretty good.
@@Sparks621 I have had it crash on me a few times. I'm curious though; what do you mean scrolling the hex view is a pain?
Your videos always get me with this. You present something I'm familiar with and think I know about, so I assume I already know what you're going to find. You present it as a beginner level understanding, further lowering my guard, and then you hit me with stuff I don't know.
Doing some looking, I found the structure so far seems to be the magic two bytes, then a delimiter, then a boolean for whether the file is saved or not.
If it IS saved then it has the length of the following path as a single byte, followed by the path itself in, I'm assuming, UTF-16.
Next is the length of the content in variable amount of bytes.
HERE is the kicker, the next 48 bytes are a Keccak-384 hash of the content which seems to start with bytes 0x05, 0x01 then 46 bytes of the rest of hash.
Next I don't know but seems to be more bytes until a 4 byte chunk at the end with the length again.
Then the content ended with a null byte then four final bytes that I also can't track down.
Hope this helps with that hash issue tho!
The 0x05 0x01 are the encoding and the carriage type. :) Also that length at the start is a varint. After the 44 byte metadata structure (which is the structure that you described as the 46 bytes) there are more varints, which represent the cursor position. They are the same if there is no selection, otherwise it's the selection start and end in chars. Then a delimiter that seems to be the number 1 as a 32-bit int, and then another varint for the length of the content. IDK what else is after that.
All you need to do to disable this behavior is
go to notepad settings and under "When Notepad starts" --> "Open a new window" instead of the default "open content from the previous session".
No more bin files this way.... :)
Yeah I haven't used windows in a minute but last I did, notepad didn't even do that so hopefully there's a disable feature option too lol
Sort of true...but if you shut down your system without closing notepad it retains the text that was displayed and will open it next time.
@@daviddelaney363 You can change that too.
The best thing about Windows is that hard drives can be erased in preparation for installing Linux.
Apparently emacs does this to an extent too, and, I can't remember, but vi might too. This coming from a fellow linux user
Your videos always make my day. Keep shining!
Lol that lil screen connect moment feels... Curiously timed 😊❤
reminds me of stuff I had to do with satori back in the day to parse different network packets that weren't well defined. Lots of trial and error, cutting, looking, displaying, changing! Always interesting to see what John is up to these days!
enjoyed this. love your long form stuff.
Love the shoutout to 010! As a fellow Canadian I'm happy to see them get the love they deserve!
Loved this one John!
I've never commented on a video before, but I had to for this one. It's that good!
Basically, the people who now own microsoft don't have people's best interests at heart. Let's go reactOS
Love your new video format keep it up😊
This drives me insane. Programmers and security staff insist on changes that make the user experience much more difficult in the name of security, then they do something like this which is bound to cause bigger security issues than anything they resolved with their user unfriendly changes. Notepad history and cache should at least be opt in, with a warning not to type passwords into it in the clear.
lol... it's NEVER about security. No matter how much they claim it is... it just is not. I have been in IT for years now... Security is a "throw away word" to justify some whacky shit... with the end result being insecure.
I write articles and research data for a living. I found this happy added feature by accident when I updated to windows 11 over a year ago. This has been a feature ever since the 11 release. I personally LOVE it. For the exact reason you listed. I am a coder "hobbyist" and I oftentimes work frantically and quickly when diving into "rabbit holes" when doing research. This feature has saved my butt more than once with its "autosave" feature when writing.
My method when writing is to just let thoughts roll out. I ignore misspellings and proper punctuation, then when I am done spilling my brain on the screen, I go back later and go through it and make sense of everything I wrote. I love just popping notepad open when doing reporting on coding. I can just spill it out with snippets of code in my head along with what I write, knowing that if my computer crashes from what I am doing at the same time on any one of my 3 other screens, it will be saved with every edit I do at any time. Super helpful. LOL
This feature caused my notepad to be in a corrupted state where it was trying to run a formatting operation on dozens of open unsaved files. Took a minute to get the app functional again, I had to go and delete all the cached shit windows was doing in these folders
Awesome & quality content
Well, this escalated quickly to writing a bespoke parser on-the-fly.
hahaha this content is S-tier dude
I could see programming out pattern matching for certain things, as someone that enjoys red teaming I see a lot of "this is terrifying what can probably happen here"
I'd be looking at other things like I can't alter code in another program or the OS will see it as misbehaving and close my program, but what I could do in theory is get pointers to the buffers, if I want to do some weird low level code stuff. I am hesitant to go full nerd with how and what I could think to try, but this could be a scary tool.
They should use the tech in voting systems instead of what is used in modern computers. This is especially important for financial transactions. Why are we using obsolete tech that can be hacked when we have unhackable tech sitting in the voting machines? There has to be a way to incorporate that tech. Voting machines are so secure that they cannot be hacked, even when connected to an unsecured wifi network. That's highly impressive but no one wants to further explore this wonderful advancement of technology. Doesn't make any sense.
39:02 Tip: It's using variable-length encoding; a cleared high bit denotes the last byte. 0xE8 0x02 is (0x80 + 0x68), (0x02), where the 2 becomes 0x100 (left shift by 7 bits), and 0x100 + 0x68 = 0x168, which is 360 in decimal, the file length. The value @0x86 is the text length. The pair @0x7e and @0x80 could be the selection/cursor position: when the selection is empty, both values are equal; with the cursor at the end, both values equal the text length.
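To make the arithmetic in the tip above concrete, here is a minimal Python sketch of that variable-length decoding (other comments in this thread call it ULEB128). The function name `decode_uleb128` is made up for illustration and is not taken from the video's script; the only input used is the byte pair quoted in the comment.

```python
def decode_uleb128(data: bytes, offset: int = 0):
    """Decode one unsigned varint starting at offset.

    Returns (value, next_offset). Each byte contributes its low 7 bits,
    least significant group first; a cleared high bit marks the last byte.
    """
    value = 0
    shift = 0
    while True:
        byte = data[offset]          # IndexError here means truncated input
        offset += 1
        value |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:       # high bit clear: this was the last byte
            return value, offset
        shift += 7


# The byte pair from the comment: 0xE8 0x02
value, next_offset = decode_uleb128(bytes([0xE8, 0x02]))
print(value, hex(value))  # 360 0x168
```

Returning the next offset along with the value lets a caller keep walking the file, since the fields after a varint start wherever the varint happens to end.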
keep up the amazing work
There is a closing event for programs in windows, when the x is pressed or the program is exited through normal means, that probably triggers writing whatever is in the textbox window to a file..
12:45 Thanks for warning :)
I don't have Slack, but whenever I hear a Discord notification sound while watching a video I wind the video back a few seconds several times to be sure it really didn't come from Discord. Same would be with Skype, but I rarely hear Skype sounds in videos.
Did something similar. I enjoy decoding and parsing data structures without documentation.
Same, actually! I do a lot of game RE, so this was right up my alley :)
Brings back memories of when I dissected game save files trying to figure out what all the bits of data did and where all the values are stored. It's ironic that a lot of modern game files have better security than this, either because they encrypt the contents or compress it which is almost the same effect if you don't know the compression algorithm.
I found that _feature_ when I edited a batch file. I wondered why the batch file didn't run my new commands and instead ran the old version. And when I opened the batch file, there it was, the old version. After opening, editing, saving, and closing it several times, I became aware that notepad has tabs, and within those tabs were several versions of that batch file, courtesy of me opening and closing it several times. I closed them all and searched the notepad settings to turn off that feature before editing the batch file once again. Very troublesome feature.
do 0.bin or 1.bin have data in ADS (Alternate Data Streams)?
Pretty cool! Lots of places to tuck stuff 😜
Thanks a BUNCH, man! 🍷
(well, I guess those places were always there, but… meh 🤷♂️)
I haven’t watched the full video just yet but I am curious what happens if you make the file hidden if saved is it still holding tabs? Is it easily found still?
Don't sell yourself short John, we all just watched a mastermind at work here! Fantastic video.
I wonder how this will work if I were to open a file from removable media.
Like I close the notepad after file is opened and then remove that device.
Hey man great vid! I've literally been searching for over a week to see if unsaved notepad data is stored anywhere as I have some valuable info i lost after a crash. Tried data recovery tools and none of my files are present but I had initially created them in my notepad. For a non coder, how does one run the git code locally to try decipher these bin files cos I just get gibberish when opening in the hex editor, no legible data is shown and if it does its a word or two.
I believe the extra data while you have notepad open is the Undo/Redo data.
so thats why all my skyrim mod ini's were still open in tabs after i saved and closed the notepad
Is it keeping track of the change tree for undo and redo resulting in scenarios where deleting data does not notably change things until the editor is closed?
Looks like it.. . because ya know edit history is a thing like undo/redo are things in notepad.
John!! Please!! Can you explain what the "some nonsense" is you mention in your vid?? I'm literally dying to know because it shows up in my parsing of my own homebrew janky code I play with. But its there for a reason so it has to represent something, right??? (Extra question mark for effect)
You are having trouble parsing these files? Did you check my github for the tabstate-util project? I have been refining it since before the video aired. Might help you figure out why. I need to put together a description of how this is laid out.
Thank you so much!!
I think you should try VSCode with Jupyter Notebook extension for such videos. Sublime Text may be nice for recording but working with individual code blocks that can be run separately feels much more nice for developing that kind of small programs. Like you wouldn't need to open separate python shell to check path bytes instead all that inside separate Jupyter code block and you wouldn't be slowed by thinking about whole program logic but rather work on individual small problem at a time.
that nonsense text might be captured keystrokes. This is keylogging.
I almost wonder if that random mess of garbled bytes when notepad is open is notepad tracking the keystrokes in a buffer as you type, and then the main window close function parses that into text.
Yes, actually, that is what a few other people have concluded. I haven't looked into it, yet, but this does sound like the most likely thing, as it's mess until you close it.
@@nordgaren2358 if that is the case, that’s an even bigger security risk because that essentially turns notepad into a keylogger.
@@nordgaren2358 was that a thing in Windows 7's Notepad?
recursively about the 420 ism, but tbh for those of us who are just getting into this kinda thing, this really puts in perspective and insight a lot about certain qualities of binary and hex and how it relates to registry keys and how it all works together, and so im more intrigued because i still am kinda noob but this way of attack made me understand the fundamental underlyings of things a lot differently and probably easier to digest for my braincells....
Notepad++ has the same behaviour, any files open when you close the editor will persist somewhere and be restored on opening the app again.
Notepad is a tool essential, i work and save with it for fun. i am curious about this even more. thank you john..
The buffer might just contain the info for undo/redo when you don't close it. But as soon as you close it, it discards the undo redo history?
Or is that handled outside of the file.
No redo in notepad, unfortunately
@@nordgaren2358 lol what a joke
is this similar to notepad++ ? files open as tabs in notepad++ persist when it is closed and then re-opened..
Win10 notepad recently got this cache feature as well but doesn't have tabs. I had windows update reboot and windows reopened unsaved notepads I had open. With previous reboots, notepad lost all that info in the open unsaved files. I guess it must be pulling from windows state but i haven't looked...
another YouTuber who has made some interesting videos about figuring out binary formats is MattKC, with the videos he has made on Lego Island and the video he made on recovering a corrupted save file for a game he was playing.
Heads up, I only watched the video up to 43:00 before deciding to try a bit of stuff myself, so if you had some revelations in the last 10 minutes I haven't seen them (as of this comment)
To be honest, Reading this a second time, I don't think the info here is that helpful, but it should help someone working on this get a head start. Take everything here with a tub of salt - I'm a uni student and have 0.00 years of professional experience.
Steps:
1. Create a new file and save it.
2. Load up the saved file in notepad and edit it
3. DO NOT SAVE the edits and close notepad. Reopen notepad to verify that the edits were cached (They were). Then close notepad
4. Open the file in a second editor and add some text. Save the file.
5. Reopen the file in notepad and navigate to the tab with the unsaved data.
Notepad notices that the file on disks has edits newer than the cached edits in notepad.
Therefore:
- Notepad (probably) saves the hash of the file on disk + time of last edit as well as the hash of the cached edits and their timestamps. That could be the garbled data before and after the
- I believe that the garbled data in between the delimiters and the data after the end of contents must be some form of hashes + timestamp. Perhaps the timestamp of the edits + the timestamp of the last edits and the hash + timestamp of the file on disk.
I was curious about the .0.bin and .1.bin files. Since they are considerably smaller but still follow the same format somewhat (see point 7), I decided to focus a bit on those and do some tests
Second test:
1. Create a file
2. Open it in notepad and see the cache. One file with a UUID is made.
3. Close the file, we see .0.bin and .1.bin pop into existence.
4. We also see that .1.bin is empty (Zero bytes).
5. Reopen the file in notepad. This usually makes a second (newer) tab. Close that tab so that the original tab is in view.
6. Now close the file without making any edits in the tab.
7. .1.bin is populated! Moreover, we see the same pattern (01 00 00 00) in the .1.bin file - followed by some garbled data.
8. Now repeat steps 5 through 7.
9. We see that the end of .1.bin has changed.
10. If I repeat 5-7 a second time, we see that .1.bin doesn't change, but .0.bin does? In conclusion, it seems notepad stores session data alternately, once in .0.bin and once in .1.bin. The initial session populates .0.bin, the next populates .1.bin, and back and forth.
Also, if you notice, notepad preserves cursor position between sessions, I assume that too, must be stored somewhere in those files or the main one. They're clearly a complete "Tab state" that has all the necessary info to recreate a notepad tab, including where the cursor was, etc.
I got kinda fatigued at this point as it was getting late, but I hope whoever reads this gets a bit of a head start!
Edit: I made some more observations and put them all in a github issue
It makes sense to alternate files when saving session data (second test, step 10). If, instead, it were to overwrite a single file, if a power failure event happens during the process, that entire session would become corrupt and will be lost; if it was a new unsaved file, all that data would also be gone
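A rough Python sketch of the alternating-file idea described above. This only illustrates the general technique and is not how Notepad is known to implement it: the `session.N.bin` file names, the `generation` counter, and both helper functions are made up for this example.

```python
import os
import tempfile


def save_session(directory: str, generation: int, payload: bytes) -> str:
    """Write payload to slot (generation % 2), leaving the other slot untouched.

    If the write is interrupted, the previous session in the other slot
    is still intact on disk.
    """
    path = os.path.join(directory, f"session.{generation % 2}.bin")
    with open(path, "wb") as fh:
        fh.write(payload)
        fh.flush()
        os.fsync(fh.fileno())  # push the bytes to disk before returning
    return path


def read_slot(directory: str, slot: int) -> bytes:
    """Read back one of the two slots (hypothetical helper for the demo)."""
    with open(os.path.join(directory, f"session.{slot}.bin"), "rb") as fh:
        return fh.read()


with tempfile.TemporaryDirectory() as tmp:
    for generation, payload in enumerate([b"first", b"second", b"third"]):
        save_session(tmp, generation, payload)
    # Slot 0 was overwritten by the third save; slot 1 still holds the second.
    print(read_slot(tmp, 0), read_slot(tmp, 1))  # b'third' b'second'
```

A real implementation would also need some way to tell which slot is newer and whether it is complete (for example a counter or checksum inside the file); that part is left out here.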
Mate, what are you doing with Connectwise ScreenConnect?
Well geez! I guess I need to reconsider my Notepad password list!
I uninstalled the new notepad. Old notepad is still there but I dont know how to make it available for Open With.
im just confused if they have access to your computers why wouldn't they just load the files directly.
OR is the issue just the fact the files are in the appdata folder?
The saved version seems easier just to say hey what's the file name, read contents of the file. done
if the original file is deleted does the bin get deleted?
they would need to store every opened with notepad file and if path to this file changed it will not read it
I think notepad has some issues for my use. Like I can't get it to display in dark mode, only in bright mode... thinking about finding an alternative
I run Linuxmint and I use an app called notes that does this same thing I never thought of it as a security vulnerability though thanks for pointing this out I will take a long hard think about this issue. .
Doesn't Sublime Text have the same feature?
subl will reopen with the contents of a deleted file if subl is open when the file was deleted, then subl is closed and reopened.
Is there a way to completely kick Microsoft out of your computer to take full ownership of every update or any change and to stop any spying?
I was like bro who is slacking me on the weekend
Notepad also remembers what you highlighted. Just highlight some text, close and reopen notepad... It's still there.
Yea, it's part of the tabstate file format
It's easy to see how this could be a very bad "feature" for security. It's common for support people managing machines to open config files and such in Notepad. At any point in the future, someone could open Notepad and see what was written there.
I mean, these same users make a sticky note into a security issue as well. I'm not saying there isn't a possibility of an issue, but I am saying that if you're running system critical configuration or highly sensitive details... maybe ensuring the file is closed properly & won't reopen in the same editor might be the smart play.
For the record (for those not aware), this "snitching" issue is only about the new multi-tab UWP notepad, but all windows versions including Win 11 still come with the classic single-tab notepad.exe (C:\Windows\notepad.exe) based on good old edit control, so just use this one instead and you are safe.
I've installed Win11Pro 23H2 and that option looks like it's gone from NotePad.exe.
I suspect your magic numbers before the text value are likely
Character count, rows, columns
It's a varint. It encodes the number 7 bits at a time and uses the high bit of each byte to indicate that there are another 7 bits after that need to be accounted for, basically.
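For anyone who wants to see the encoding direction of the varint described above, here is a small Python sketch. It assumes the unsigned LEB128 layout that other comments in this thread mention; the function name `encode_uleb128` is made up for illustration.

```python
def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer 7 bits at a time, low group first.

    Every byte except the last has its high bit set to signal that
    more bytes follow.
    """
    if value < 0:
        raise ValueError("this encoding covers unsigned integers only")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:                    # more groups remain: set the high bit
            out.append(low7 | 0x80)
        else:                        # last group: high bit stays clear
            out.append(low7)
            return bytes(out)


print(encode_uleb128(360).hex(" "))  # e8 02
```

Values below 128 fit in one byte, which is why short files look like they have a plain one-byte length until the content grows past 127.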
I hate Notepad right now given that closing notepad tabs is not something that fits in my workflow. I mainly use Notepad++ but given it has that same feature, it wasn't always the right choice (I probably have about 40 different unsaved text files open in N++). So I used to use both and now I'm just annoyed whenever the default notepad pops up. Mainly because even if I open a txt file in it, it won't display that txt I just clicked on, it will just show the last thing it had open, meaning I then have to tab over to the thing I just opened
You mean, just like you can recover old notepad unsaved notes!?
Oh my!
Can this be useful for accidentally closed notepad files? (obviously failing to save them / I mean closing the tab which notifies you that if you dont save itll be lost).
I didn't know... I followed and did your AD signup...
*By submitting this form, you acknowledge and agree that you are an active security professional currently affiliated with an organization.
Note: I am not really unless you consider my hobby as an organization...*
Yes, that is the idea in other similar APPs. It's a way to keep state in the APP itself, so it's not a backup function for the txt file you're editing, but it can save lots of time if you forget to save that txt file. The content will not get lost. The idea of keeping state is in particular good for programmers, maybe not so much with notepad, but with real editors.
I don't know the function is as helpful in notepad as in real editors, because notepad cannot really be used for programming, no syntax highlighting and a lot of other stuff missing.
Can you post your python code somewhere please? Would love to play with it.
I used to use notepad to store private things temporarily this ruined the only usecase I had for it.
Yeah, the second I saw that update news, I dropped Notepad like it's hot. I always used to use it to just have a quick scratch place for ephemeral data, copy/paste, edit, etc. And now, it leaks that data. Gross.
Maybe the problem is more that it's opt-out. This is the typical dilemma when adding features to an APP: you can make it opt-out, in which case you can be sure that the users will experience it, or you can make it opt-in and do these small tutorials when the APP starts the first time, with small screens telling what is new, which everyone skips.
But yes, it doesn't seem as helpful in notepad as in an actual editor. I also disabled it.
I use nano in a command line interface on a linux machine.
I don't sweat such infiltration.
For what? Saving passwords?
there is a feature in neovim that basically does the same as this
does NPP do this differently?
I tried to open a cache file in Notepad, I got an error message, then I noticed that all the cache files had been deleted
You should be the voice actor for Michael Bay film trailers.
Honestly, I think a cool thing about it would be able to pull data from it without using notepad. Sorta like saving the tabs to their own file on a system close or signout, or even pull it from a non bootable user space. Frankly, I can't tell you how many times a forced reboot just screwed me over with my notepad scribbles of the moment. I think this is an awesome feature.
Vim can do this too, and Neovim does it by default. I think VSCode does it too, and Sublime probably too.
I believe browsers also cache entered form data and only delete it once you submit it or navigate away.
I bet the photo app creates a low-res thumbnail file for every picture you open.
It's almost like any modern application does this caching in case of interruptions
Something similar with some modern day phones which no longer have removable batteries? Whether turned off or restarted...closed apps re-open where they left off b4 closed when device powered on again. Really annoying & yucky. No like it at all...phooey! Thx for vid! Possibly similar activity as notepad, possibly? Logo double shows on power down(longer on, more likely it will double play logo on shutdown...instead of just once when no quirky anomalies). Same mobile update w/same date from last year keeps getting pushed as needed download despite already updating a few times. Camera app opens at boot & randomly opens..screen blacks out for 1sec randomly, often opening camera app. Something seems...not right 🤔 Factory reset or default restore doesn't change observed behaviour for almost 2yrs now.
I finally switched to Windows 11 and was wondering what the hell was going on with that.
Watching your coding artwork unfold onscreen is... depressingly good... That said, Notepad++
It's not particularly interesting or surprising that a program that auto-saves has locally cached files that can recover the contents of unsaved files. Anyone with direct access to your filesystem could simply re-open notepad.exe and see exactly the data that you extracted there. What would be a lot more interesting to explore is whether and when deleting or closing without saving one of these unsaved files leaves easily recoverable artifacts. For example, if you type something into notepad, close the tab and hit don't save, does it delete the file? Does deleting the contents delete the contents of the file? If so, when?
Ya I used to like a lot of stuff before Microsoft changed it. WIll have to get a diff simple editor.
Great. Another thing that'll be added to infostealer malware
Hold on, let me check my slack really quick... oh wait...
You can use a hex editor. It will show you the hexadecimal numbers on the left side and the text equivalent on the right. Incidentally, this isn't new, since Windows tends to cache out memory to disk, and if you have a good disk editor and a kernel debugger, you can pull that information from the disk cache. That's why if someone gets a hold of your machine, it's generally game over unless you encrypt the entire disk at the boot level.
Python is so fun to screw around with ngl
here is the better solution for me to display the text with line breaks:
original_file_contents = original_file_contents.decode('utf-16')
# Use the splitlines method to handle line endings correctly: Windows uses Carriage
# Return (CR) and Line Feed (LF) characters (often abbreviated as "\r\n") for line
# endings, while Unix-based systems only use the Line Feed (LF) character ("\n").
lines = original_file_contents.splitlines()
for line in lines:
    print(line)
otherwise part of the text will be missing after conversion to utf-16
The buffer text in the tabstate files all uses unix type line feeds. The tabstate converts all text to unix type line endings and utf16le, no matter the source file's encoding or carriage return type.
part of the text is probably missing because you are not reading the var int. it's uleb128
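Putting the two replies above together, a hedged Python sketch of reading the text buffer could look like this. It assumes the length prefix is a ULEB128 varint counting UTF-16 code units (2 bytes each) and that the text follows it directly; both are guesses from this thread, not a documented format. The function name and the sample bytes are made up for illustration.

```python
def read_length_prefixed_text(buf: bytes, offset: int = 0) -> str:
    """Read a ULEB128 length, then that many UTF-16LE code units."""
    count = 0
    shift = 0
    while True:
        byte = buf[offset]
        offset += 1
        count |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:       # high bit clear: last length byte
            break
        shift += 7
    raw = buf[offset:offset + count * 2]   # assumption: 2 bytes per unit
    if len(raw) != count * 2:
        raise ValueError("buffer is shorter than the declared length")
    return raw.decode("utf-16-le")


# Synthetic sample: 360 characters, so the length needs two bytes (0xE8 0x02).
sample = "hello world\n" * 30
buffer = bytes([0xE8, 0x02]) + sample.encode("utf-16-le")

text = read_length_prefixed_text(buffer)
print(len(text), len(text.splitlines()))  # 360 30
```

Treating only the first byte as the length would give 0xE8 = 232 here instead of 360, which matches the "part of the text is missing" symptom once the content is longer than 127 characters.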
I don’t own a pc anymore but would be interesting to see the tabstate on a password protected file?
Probably would be plaintext in the tab state, but good question.
People probably complained that the old notepad lost everything if there was a power outage or some other fault causing the computer to shut down so they added the ability for it to 'remember' what you were doing. Should be optional though.
why were you doing this at 1-2 am
This behavior is also in Notepad++, for years…and BBEdit
Soon as I noticed it I just close the tab.
Make sure to update to the latest Bleeding Edge first..
It probably should be noted this is the notepad app and not the native notepad everyone is familiar with. Windows apps work differently than the traditional software that most are accustomed to on windows, and they are not the same.
"It's a feature not a bug dammit" - William Henry Gates III
Does it Mean anything If i localize windows
my immediate thought on seeing the data that changes every save even when you don't make changes was "timestamp" but that does not appear to be the case. Decoding it as a string doesn't give anything resembling a date and decoding it as a number doesn't either. It's either wildly too low or too high to be a unix timestamp, and doesn't work as a straight date either. Hmmmm.
Yea. It definitely has some kind of time information, but I have no idea what format they are storing things in. You can see the last 4 bytes moving as time goes on (wait an hour, save, and you'll likely see the 5th byte increase) and the first four bytes almost seem random. I have no clue, tbh. :(
Now do this for Apple Notes or Google Keep!