14 Mar, ’21

GChat Logs, Or: Why Google Sucks

by Cal

Today's complaint is about OUR NEW OVERLORD, GOOGLE. See, the other day I started working on a project I've wanted to do for about... yikes, 3? 4 years, now? where I collect all the chat RPs I've been in into their respective storylines, for better organization and re-reading purposes. Now, the vast, and I do mean vast, majority of chat RP I've done over the past ten, twelve years has been on the various incarnations of Google Chat. Searching Google Chat histories is a disaster, and you basically can't copy/paste them with any efficacy anymore, but B told me that you could export them. What luck!

Haha. Hahahahahaha. Hahahasflkajsfldkajs.

Okay so it turns out there are three distinct categories of logs. Going from most recent to oldest, we have:

Google Hangouts. These logs are exported in a single JSON file. They contain massive amounts of completely unnecessary metadata and need to be parsed as a stream, because loading the entire JSON file into memory as a manipulable object uses several times more memory than the size of the file. Because Of Course It Does.

Getting the Google Hangouts logs trimmed and sorted into chronological text logs is, as it turns out, The Easy Part.
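To make "parsed as a stream" concrete, here is a minimal sketch using only the Python standard library. It assumes the records sit in a top-level JSON array of objects, which is a simplification: the actual Hangouts export nests things deeper, so this would need adapting, or a dedicated streaming parser (the third-party ijson package, for example). The name `iter_array_items` and the `"text"` field are made up for illustration.

```python
import io
import json


def iter_array_items(fp, chunk_size=65536):
    """Yield the elements of a top-level JSON array one at a time.

    Assumption: every element is a JSON object (or array/string). A bare
    number split across two chunks could be decoded too early.
    """
    decoder = json.JSONDecoder()
    buf = ""
    # Read until we have something other than leading whitespace.
    while not buf.lstrip():
        chunk = fp.read(chunk_size)
        if not chunk:
            return  # empty input, nothing to yield
        buf += chunk
    buf = buf.lstrip()
    if buf[0] != "[":
        raise ValueError("expected a top-level JSON array")
    buf = buf[1:]
    while True:
        # Skip whitespace and the commas that separate elements.
        buf = buf.lstrip(" \t\r\n,")
        if buf.startswith("]"):
            return
        try:
            item, end = decoder.raw_decode(buf)
        except json.JSONDecodeError:
            # The element is incomplete so far: pull in more text and retry.
            chunk = fp.read(chunk_size)
            if not chunk:
                raise ValueError("truncated JSON array")
            buf += chunk
            continue
        yield item
        buf = buf[end:]  # drop the element we just handed out


# Tiny in-memory stand-in for a huge file (hypothetical record shape).
sample = io.StringIO('[{"text": "hi"}, {"text": "hello"}]')
for record in iter_array_items(sample, chunk_size=8):
    print(record["text"])
```

Only one element plus one chunk is held in memory at a time, so the unnecessary metadata can be thrown away record by record instead of after loading everything.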

Going one step further back, we have Single Line Mail Items. Instead of the text line being saved in whatever database they put the Hangouts stuff in, it was stored as an email. Yes. An email. One chat line. An email. But okay. That's going to need an entirely different parser and several additional libraries, and the code I found to do it didn't work (but I think that might be because the mbox I exported has ALL my emails, so I'm trying another export with just chat).
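For reference, Python's standard library can read an mbox without extra packages. The following is a rough sketch, not the code mentioned above: it assumes each message carries one chat line as a text/plain body, with the speaker in the From header and the time in the Date header. Whether the real export looks like that is an assumption, and `iter_chat_lines` is a made-up name.

```python
import mailbox
from email.utils import parseaddr, parsedate_to_datetime


def iter_chat_lines(mbox_path):
    """Yield (timestamp, display_name, address, text) for each message."""
    # create=False: fail loudly instead of creating an empty mbox on a typo.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            name, address = parseaddr(str(msg.get("From", "")))
            try:
                when = parsedate_to_datetime(str(msg.get("Date", "")))
            except (TypeError, ValueError):
                when = None  # missing or unparseable Date header
            text = ""
            # Assumption: the chat line is the first text/plain part.
            for part in msg.walk():
                if part.get_content_type() == "text/plain":
                    payload = part.get_payload(decode=True) or b""
                    charset = part.get_content_charset() or "utf-8"
                    try:
                        text = payload.decode(charset, errors="replace").strip()
                    except LookupError:  # unknown charset name
                        text = payload.decode("utf-8", errors="replace").strip()
                    break
            yield when, name, address.lower(), text
    finally:
        box.close()
```

Messages where the display name is missing come back with an empty `display_name`, so the email address is the only reliable key in this sketch.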

Oh, and did I mention? The Google Hangouts export identifies people by some sort of back-end user ID and their display name. The Single Line Mail Items identify people by their display name and email address. Or sometimes just the email address.
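One way to cope with three kinds of identifier is a hand-maintained alias table that maps every known user ID, display name, and email address to a single canonical name. This is only a sketch of that idea; the function names and the sample identifiers are invented for illustration.

```python
def build_alias_index(people):
    """Map every known identifier (case-insensitive) to a canonical name.

    `people` is a dict of canonical name -> iterable of identifiers
    (back-end user IDs, display names, email addresses).
    """
    index = {}
    for canonical, identifiers in people.items():
        for ident in identifiers:
            index[ident.strip().lower()] = canonical
    return index


def resolve(index, *identifiers):
    """Return the canonical name for the first identifier we recognize."""
    for ident in identifiers:
        if ident and ident.strip().lower() in index:
            return index[ident.strip().lower()]
    # Unknown person: fall back to the first non-empty identifier as given.
    return next((ident for ident in identifiers if ident), "unknown")


# Hypothetical data: one person seen under three different identifiers.
index = build_alias_index({"B": ["1234567890", "B. Example", "b@example.com"]})
print(resolve(index, "", "B@Example.com"))
```

Passing identifiers in order of reliability (ID first, then email, then display name) keeps a changed display name from splitting one person into two.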

Lastly, we have the old Chat Logs. These are formatted like you would expect chat logs to look, except without any kind of timestamps. Each conversational session has its own log, which is stored as... An Email. Yes. Between you and whoever the participant(s) in the chat were.

The part I am most exasperated about is the Single Line Mail Items because THEY ARE PART OF THE SAME CONVERSATIONS AS THE HANGOUTS LOGS. Do they have any of the identifying conversation IDs or information in them? I don't know! I wouldn't be surprised if the answer is NO. So while I've figured out the first set, and the last set is fairly easy as I can just save them as individual dated log files, the second set? I have to figure out how to MERGE IT INTO THE FIRST SET. Because they are from THE SAME CONVERSATIONS.
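If the mail items really carry no conversation ID, one fallback is to treat "same set of participants" as "same conversation" and interleave both sources by timestamp. That is a loud assumption: two different group chats with identical members would be merged together. A minimal sketch, with `merge_logs` and the tuple layout invented for illustration:

```python
from collections import defaultdict
from datetime import datetime, timezone


def merge_logs(hangouts_lines, mail_lines):
    """Group lines by participant set, then sort each group by time.

    Each input line is (timestamp, participants, speaker, text), where
    participants is an iterable of canonical names. Timestamps must all
    be timezone-aware (or all naive), otherwise they cannot be compared.
    """
    threads = defaultdict(list)
    for source in (hangouts_lines, mail_lines):
        for when, participants, speaker, text in source:
            threads[frozenset(participants)].append((when, speaker, text))
    for lines in threads.values():
        # sort() is stable, so lines with equal timestamps keep source order.
        lines.sort(key=lambda line: line[0])
    return dict(threads)


# Hypothetical sample: one Hangouts line and one mail item, same two people.
t0 = datetime(2021, 3, 13, 10, 0, tzinfo=timezone.utc)
t1 = datetime(2021, 3, 13, 10, 1, tzinfo=timezone.utc)
merged = merge_logs(
    [(t1, {"Cal", "B"}, "Cal", "second")],
    [(t0, {"B", "Cal"}, "B", "first")],
)
for when, speaker, text in merged[frozenset({"Cal", "B"})]:
    print(when.isoformat(), speaker, text)
```

Each resulting group could then be written out as its own text log, the same way as the other two sets.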

HRhrkjhajghKJEHFjhFSjhf I would say "how does Google stay in business when their development is so GARBAGE" but I know the answer and the answer is they are an advertising company and they make bank on ads.