My son is six now, and he is enjoying comic books. I tried to read X-Men with him starting at the Bronze Age with the Claremont reboot since we had just finished the X-Men Animated Series, and several of the storylines were directly from that era. He enjoyed it, but then he kept asking to read "from the start in order" (definitely my son!). So first I tried to use a reading list, then I started diving into the MCP to create my own lists. I realized that what I was doing by searching the MCP could probably be done much more efficiently with a computer.
What is weird is that one of the first things I ever coded back when I was in school was a tool that would go through the MCP and give you a "last in" and "next in" for all of the characters in a given comic. That was probably in 2003, before you guys added the search feature to the site. I was planning on approaching you guys with it, but I got busy at my new job. Eventually, as I matured as a developer, I realized the code was crap, and then I saw it had been implemented much better on the site.
So back in the present, I got to work on a tool to help with my ordering problem. I wanted something that could determine where one comic sits in relation to another. I knew the data was not granular enough to do this in all cases (partly because the style guide requires you to keep the least information necessary, and partly because of things like time travel and mistakes), but for most issues it would be okay. I wrote something pretty simple in PowerShell to get me started: a short parsing script that goes through the pages and finds all of the pairs of entries where the first is immediately followed by the second in at least one character's chronology. I also have a heuristic that figures out the most popular character who has that pair in their chronology and writes that out. Next I wanted to eliminate as many loops as possible from the data so that I can generate more authoritative answers. To do this I wrote a script to find loops of length N within that list of pairs. I will eventually write a tool to walk the tree unbounded, but I figure solving the smaller loops first would be helpful. When a loop shows up, it is indicative of one of four things:
- A story has a break in the middle, and some of the characters have not been separated to one side or the other of the split (usually just due to the style guide).
- There are multiple ambiguous flashbacks in the story that take place at different times (usually due to the style guide).
- There is a time traveler who has looped back on their own continuity (there isn't really a notation to handle this).
- There is a legitimate mistake in the MCP.
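Here is a rough sketch of the pair extraction and loop search described above, written in Python rather than the original PowerShell. It is only an illustration under assumptions: the input format (a dictionary mapping each character to an ordered list of issues), the function names, and the sample characters and issues are all made up, and the page parsing and the "most popular character" heuristic are left out. The search here reports loops of up to a given length rather than exactly N.

```python
from collections import defaultdict

def adjacent_pairs(chronologies):
    """Map each (earlier, later) pair of consecutive entries to the
    set of characters whose chronology contains that pair."""
    pairs = defaultdict(set)
    for character, entries in chronologies.items():
        for earlier, later in zip(entries, entries[1:]):
            if earlier != later:  # ignore repeated appearances in one issue
                pairs[(earlier, later)].add(character)
    return pairs

def find_loops(pairs, max_len):
    """Return every directed loop of at most max_len issues.
    Each loop is reported once, starting at its alphabetically smallest issue."""
    graph = defaultdict(set)
    for earlier, later in pairs:
        graph[earlier].add(later)
    loops = []

    def walk(start, node, path):
        for nxt in sorted(graph.get(node, ())):
            if nxt == start:
                loops.append(path[:])
            elif nxt > start and nxt not in path and len(path) < max_len:
                walk(start, nxt, path + [nxt])

    for start in sorted(graph):
        walk(start, start, [start])
    return loops

# Made-up data: "Hero Three" contradicts the order implied by the others.
chronologies = {
    "Hero One": ["ISSUE A", "ISSUE B", "ISSUE C"],
    "Hero Two": ["ISSUE A", "ISSUE B"],
    "Hero Three": ["ISSUE C", "ISSUE A"],
}

pairs = adjacent_pairs(chronologies)
print(find_loops(pairs, 3))  # [['ISSUE A', 'ISSUE B', 'ISSUE C']]
```

Because the pairs keep track of which characters contribute them, each edge in a reported loop can be traced back to the chronologies that need a second look.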
I will say that maybe a git repository (or another change management system) would be a great upgrade to this site. Rather than submitting errors and changes to a message board, you could just allow all users access to a repository, and have some kind of approval system. You would be able to see the changes all in one place with the old and new side-by-side rather than doing all the bolded (<--) and (-->) stuff. You'd be able to have an explanation attached to the change, and with proper tags you'd be able to search for all changes associated with an issue to make sure you aren't violating some past research that was forgotten around that issue. When a change is approved, it could be pulled in directly, and it could even have style guidelines applied automatically, leading to fewer typos and less inconsistent formatting. All of the pending changes would be in a queue, and you'd be able to send changes back for reworking or reject them completely to get them out of your queue. The fact that each row is independent makes this a perfect use case for a change management system.
Anyway, that is just an idea. If you go to a change management system, I think it would be cool. Otherwise, I'll just keep giving updates on the message board and keep using my local repository.
It is good to be back! I'm glad this project is still going strong!