My records show three cases where solitary strips are duplicated in GoComics precisely one year after their original publication, apparently in error. Of these, two (published Jun 7, 1976 and Jul 12, 1976) are unique to the GoComics dataset, while only one is shared with the Universal Uclick reprints service:
If GoComics and the Universal Uclick reprints service do not mend their archives, this inherited deficiency will eventually have to be remedied by reference to the original strip, as found in released collections or newspapers archived in libraries or museums.
NULL Date Value
Following implementation of the combined Search By Date/Storyline interface, I began seeing occasional errors in my Apache Error Log suggesting that in some cases an invalid Date value was passed to the Starting Date (and possibly Ending Date) control. I was unable to replicate this error and, lacking any user reports, settled for adding some basic debug code to identify the erroneous value (which turned out to be NULL) and confirm where it was used.
Absent user reports, the only way to track down this bug will be to add an input debug mode that preserves *all* original request data (perhaps in session variables?) in hopes of catching the circumstance that precipitates it. This will require substantial changes to my general and project-specific code files, and will therefore need to wait until I'm finished with my current content-adding phase, and ready to return to coding additional functionality.
I have recently made several profound changes to the project itself and the shared environment on which it relies. I have done as much testing as I can stand, but subtle bugs likely still exist, and the increased traffic to the project from having exposed Strip, Character, Location, and Storyline dossiers to search engine indexing makes these bugs more likely to present. I must keep a close eye on my Apache error log.
The scene system is too rigid, in that it requires that each panel (itself a problematic construct in some cases) be associated with a single scene, and thus a single location. Some strips are impossible to accurately transcribe in that way. The first panel of the strip published January 11, 1993, for example, uses two separate locations. Other strips can be transcribed this way, but only with difficulty. The strip published February 1, 2004, for example, features an intermixture of two separate locations, which would each be rendered as one scene. However, characters that originate from the Washington, D.C. scene -- Rick Redfern and Donald Rumsfeld -- are shown speaking from the television in the other scene, featuring Zipper Harris at (presumably) Walden College.
This would require that those characters *also* be associated with *that* scene, leading to repetition of data in addition to the unfortunate complexity of "trading off" scenes every panel.
The solution will be to retreat from the Scene system, and instead associate locations used as setting (which I should simultaneously rename as "depicted locations") directly with strips, in the same way that mentioned or referenced locations already are. At the same time, I should modify characters (which should also be directly associated with strips) to include an optional location indicator as well, which will (again, optionally) indicate which, if any, of the depicted locations the character appears in. Scene-based text, such as signage, can be associated with each other using a separate relational table, and removed from the flow of dialogue in the strip map programmatically.
Occasionally, Trudeau has drawn a week of strips that went unpublished. On one occasion, for example, a satire of the anti-abortion propaganda "Silent Scream" drawn for publication sometime in June 1985 was refused by his syndicate. On (at least) two other occasions, Trudeau withdrew or truncated storylines that had been outpaced by current events. The "ink spill" storyline published starting June 12, 1989, for example, replaced a hopeful (albeit rather patronizing) storyline focused on the Tiananmen Square protests, written before the crackdown. On September 17, 2001, Trudeau withdrew a storyline satirizing George W. Bush's perceived lack of intelligence and curiosity in light of national sentiment following the September 11 attacks. In this latter case, the withdrawn strips were replaced with reruns.
Where these alternate strips are available, and especially when their intended publication can be traced to a specific date, this project should compile data on them as well.
Automated Name Differentiator
When two characters with identical common names occur in the same context, such as in search results, they should be differentiated from each other automatically using some distinguishing characteristic of their use. For instance, for two characters named "Alumnus", attending a Walden College reunion in 1975 and 1976, respectively, when they occur in the same context they would be labeled "Alumnus ('75)" and "Alumnus ('76)".
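The labeling rule above can be sketched as follows. This is only an illustration: the tuple layout and the choice of a year as the distinguishing characteristic are assumptions, and characters who share both a name and a year would need some further tie-breaker.

```python
from collections import defaultdict


def differentiate(characters):
    """Return display labels keyed by character id.

    `characters` is a list of (char_id, common_name, year) tuples. This
    shape, and the use of a year as the distinguishing characteristic,
    are assumptions made for illustration only.
    """
    # Group characters by common name so collisions are easy to spot.
    by_name = defaultdict(list)
    for char_id, name, year in characters:
        by_name[name].append((char_id, year))

    labels = {}
    for name, entries in by_name.items():
        for char_id, year in entries:
            if len(entries) == 1:
                # The name is unique in this context; no suffix is needed.
                labels[char_id] = name
            else:
                # Append a two-digit year, e.g. "Alumnus ('75)".
                labels[char_id] = f"{name} ('{year % 100:02d})"
    return labels
```

Because the grouping is done per call, a character is only given a suffix when a namesake actually occurs in the same result set.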
Additional Options For Adding Find A Strip Characters
When a user is viewing a specific Character record, that character should be added as an option in the Add New Character select box. Similarly, when a user has employed the (as yet uncoded) "Find A Character" function to search for characters, the character(s) returned by the latest search should be added as options to the Add New Character select box.
I've added a rudimentary captcha system to the feedback forms in this project. Time will tell whether it will suffice to prevent automated spam. In the future, I should hook it into the contributor scoring system, to remove the requirement from those surpassing a certain threshold of helpfulness.
Basic Usage Info
Rather than simply not displaying a remove/delete form/button when circumstances like usage or subordinate data prevent deletion, I should display the button, disabled, along with a list of the factors that inhibit deletion.
A complication to this ideal that has already bitten me in the ass (although I didn't realize it at the time) is when data that is otherwise unused is referenced using tags in strip summaries/notes, character blurb/biographies, documents, storyline summaries, or other text contexts. I am not certain it is feasible to run a check for usage on such text contexts. Even if database speed and resources permit, a quirk in tag construction as simple as an extra space between the tag name and the id/date attribute could screw things up, and if a tag is permitted to have multiple attributes, the various possible orderings of these attributes could also complicate the search.
One possible solution to this, which would unfortunately require massive changes to the markup validator, would be to (a) enforce a prescribed order to tag attributes, and (b) construct a new version of the input upon submission with standardized formatting. This new version could be placed in a class property, for retrieval, or (if I permit the key to be passed by reference) automatically swapped with the unformatted version.
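As a rough sketch of the standardization step, the function below rewrites opening tags so that whitespace is collapsed and attributes follow a prescribed order. The tag syntax (angle brackets with double-quoted name="value" attributes), the attribute order, and every name in the code are assumptions for illustration; the real markup validator may differ.

```python
import re

# Hypothetical prescribed attribute order. Attributes not listed here
# sort after the listed ones, alphabetically.
ATTRIBUTE_ORDER = ["id", "date"]

# An opening tag: a name followed by zero or more name="value" attributes,
# with arbitrary whitespace in between. Closing tags are left untouched.
TAG_RE = re.compile(r'<\s*([A-Za-z_]+)((?:\s+[A-Za-z_]+\s*=\s*"[^"]*")*)\s*>')
ATTR_RE = re.compile(r'([A-Za-z_]+)\s*=\s*"([^"]*)"')


def _rank(attr):
    """Sort key that places known attributes first, in prescribed order."""
    name = attr[0]
    if name in ATTRIBUTE_ORDER:
        return (0, ATTRIBUTE_ORDER.index(name), name)
    return (1, 0, name)


def normalize_tags(text):
    """Return `text` with every opening tag rewritten in standard form."""
    def rewrite(match):
        attrs = sorted(ATTR_RE.findall(match.group(2)), key=_rank)
        parts = [match.group(1)] + [f'{k}="{v}"' for k, v in attrs]
        return "<" + " ".join(parts) + ">"
    return TAG_RE.sub(rewrite, text)
```

Once stored text is normalized this way, a usage check could be reduced to a plain substring search for the one canonical spelling of a tag.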
On the same note, when factors prevent the construction of one control - such as the Change Scene or Move lists in the Edit Panel form - rather than silently disappearing the offending control I should replace it with a Freeform message listing those factors. When the data is integral to use of the form, the remaining controls should all be disabled, preventing form submission.
Combining Form & Input Profile Construction
A more workable solution for the short term, and one that I have already begun to implement, would be to outsource repeated code to class methods/properties that could be called from any context. I have already begun to do so in my largest project code file, strips.php, but had to pause when my changes began to account for and utilize my nested Ascription system, which I then had to replace with a flat Attribution system.
Standardized Control Swapping
Sometime in the mid-1970s Garry Trudeau standardized the storytelling format of Doonesbury, such that all six "weekday" strips in a given week would relate in some fashion to a single topic or storyline. Sunday strips were very rarely devoted to these storylines, and usually reflected their progress only in the broadest sense -- to match Uncle Duke’s various occupations, for example.
A single storyline may occupy only one "week" of strips, or it may continue for several weeks. A character-driven storyline might be dropped for a time to give space to another topic or storyline. Rather than end, one story could segue smoothly into another, as the Truckers’ Protest does into the introduction of "energy czar" William E. Simon.
When I separate strips into discrete storylines, I should look for cause to do so at the single-week level. If one week of strips deals with the participation of the Walden Day Care children in Joanie’s wait for law school admission, and another deals with Joanie’s depression and malaise while waiting, the two weeks should be part of separate storylines.
When I separate strips into discrete storylines, I should also do my best to split separate but related storylines/topics, regardless of Trudeau’s attempts to segue or otherwise link them together.
In addition to an ordered list of strips, each storyline has an optional Summary field. This field should only be used for one of two defined purposes: to summarize the plot of very long, multi-segmented storylines and highlight pivotal strips within, and to provide detail and context that would otherwise need to be included in the summary for each member strip. It should, in other words, be used to minimize repetition, rather than to increase it by aggregating the summaries for each strip.
Addressing (Portions of) Strip Text
With the replacement of <noise> tags with Unicode "breath mark" symbols and &amp; entities with naked ampersands, the great majority of my transcribed strip text now corresponds precisely with strip content, which has long been a goal of mine.
Unfortunately, this system now lacks a means of indicating format or style -- lines of text or portions thereof that are rendered in bold for emphasis, or are notably larger than their surroundings, or are rendered in cursive, in typed monospace, or (as with Phred’s early correspondence in Vietnam) in clumsy block letters.
Although the intended purpose of the addressing system has since expanded, I first conceived it as a means of noting such differences in styling in my transcriptions.
It involves at least one new table, Strip_Text_Addresses, which in its simplest iteration would be constructed with the following fields:
If the styling applies to the whole text, only the first three fields are needed; the rest will be set to null. ID is, of course, the unique numeric identifier for the table record. Text_ID identifies the line of text to which the styling applies. Styling indicates the style ("Bold", "Larger", etc.) applied to the text.
If the styling applies to only a portion of the text, the four Content fields come into play, and together they provide two independent, redundant means of addressing interior content. The Start and Length fields address the styled portion of text by its numeric, zero-indexed position and length. The Text field holds an exact copy of the text expected within the substring. The Number field indicates which occurrence of the Text substring in the full line of text is meant to be thus styled.
Using two separate means of addressing interior content will make the system less likely to fail with minor changes to transcribed text. If, for example, the addition of some previously-missed comma or period invalidates the Start and Length-based addressing, such that the substring indicated no longer matches the stored Text, the system can, at the user’s direction, generate new Start and Length values using the Text and Number-based addressing.
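A minimal sketch of this fallback follows. The field names are hypothetical stand-ins for the four Content fields, and counting occurrences from 1 is an assumption. The function only reports the repaired values; whether to store them would remain the user's decision.

```python
from dataclasses import dataclass


@dataclass
class TextAddress:
    # Hypothetical stand-ins for the four Content fields described above.
    start: int    # zero-indexed position of the addressed substring
    length: int   # length of the addressed substring
    text: str     # exact copy of the expected substring
    number: int   # which occurrence of `text` is meant (assumed 1 = first)


def find_occurrence(line, text, number):
    """Return the index of the number-th occurrence of text, or -1."""
    pos = -1
    for _ in range(number):
        pos = line.find(text, pos + 1)
        if pos == -1:
            return -1
    return pos


def resolve(line, addr):
    """Return (start, length) for the addressed span.

    Start/Length is trusted only while the substring it indicates still
    matches the stored text. Otherwise the span is recomputed from
    Text/Number. Returns None when both methods fail.
    """
    if line[addr.start:addr.start + addr.length] == addr.text:
        return addr.start, addr.length
    pos = find_occurrence(line, addr.text, addr.number)
    if pos == -1:
        return None
    return pos, len(addr.text)
```

For example, if a missed comma is later added ahead of the addressed word, the stored Start no longer matches, and resolve() returns the shifted position found by counting occurrences of the stored text.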
Uses of Addressing
In addition to styling, Addressing can be used for many other purposes:
To tie Character and Location mentions and references to specific portions of strip text, rather than to the strip entire.
To note strip text with valid meaning in a foreign language. In a separate table, the project could retrieve/store a machine translation, and a more context-sensitive human translation where necessary. Three instances where the French language is used in the strip spring quickly to mind: Feb 16, 1971, Jun 20, 1973, and Dec 29, 1973.
To note non-textual aspects of strip content that cannot be accurately transcribed, such as when text is accompanied by musical notation.
To note portions of strip text of uncertain meaning, or that refer to unknown people or locations, or that are otherwise problematic. An example of this is the two unknown people referenced in the fourth panel of the strip published Jul 23, 1978.
To provide full-text equivalents for abbreviations and acronyms.
One of the primary purposes for the list of characters that appear in each scene (which were earlier assigned per panel) was to ease text transcription by limiting the list of characters that may be attributed to a small subset of the larger cast. Advancements like the Search system, however, have rendered this purpose unnecessary. Sometime in the future, I should repurpose this list of characters so that it includes only those who recognizably appear in the strip. Likewise, the lists of subject characters and locations, as currently constructed, may become unnecessary as these become tied to specific portions of transcribed text.
Subject Matter for Miscellaneous Topics
Subject Matter for miscellaneous topics can facilitate wider thematic linkage for strips that are not part of a discrete storyline but share a major theme or topic in common. Such themes are usually discussed in multiple contexts, from multiple points of view, and sometimes without any direct reference to the topic by name, rendering a search using strip text insufficient. Such topics might include subjects like "Vietnam War", "Watergate", and "The Draft".
Topics should be implemented only when nearly all other features are completed, both to more accurately gauge the topics that are not already sufficiently grouped by storyline, character, or location, and also because topics could then theoretically be attached to storylines and locations, as well as to individual strips.
Given the complexity of the hierarchical location system, and particularly with the addition of secondary parents, I should develop a sturdy ruleset for determining which parent should be primary, and which should be secondary.
The placement of certain fictional locations requires special explanation. I should either create and populate database fields to this effect, or compose a new document.
Sometime in the future, I should find a way to make locations (usually landforms) that lie partially within one or more locations subordinate not only to their Primary Parent (as the Potomac River is to the United States), but also to each component location in which they partially reside (as the Potomac River does in Washington D.C., Virginia, and other states).
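One way to picture this is to treat the hierarchy as a graph in which each location has one primary parent and any number of secondary parents, then collect every ancestor reachable through either kind of link. The sketch below uses plain dictionaries as hypothetical stand-ins for the database tables, and the sample hierarchy in the usage note is illustrative only.

```python
def ancestors(location, primary, secondary):
    """Return every location that `location` is subordinate to.

    `primary` maps a location to its single Primary Parent. `secondary`
    maps a location to a list of additional parents. Both structures are
    assumptions standing in for the real database tables.
    """
    seen = set()
    stack = [location]
    while stack:
        current = stack.pop()
        parents = []
        if current in primary:
            parents.append(primary[current])
        parents.extend(secondary.get(current, []))
        for parent in parents:
            # The `seen` check also guards against accidental cycles.
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen
```

With the Potomac River given the United States as its primary parent and Washington, D.C. and Virginia as secondary parents, a search restricted to Virginia could include the river by testing whether "Virginia" is in ancestors("Potomac River", primary, secondary).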
Locations For Fourth Wall-Breaking Strips
Doonesbury broke the fourth wall on occasion almost from the beginning of the strip. One early example that springs immediately to mind is Oct 30, 1970, only the fifth strip published. At some point, however, the characters began to evidence some knowledge that they not only had an audience, but that the action of the strip was staged for that audience. Entire Sunday strips were occasionally addressed to the strip's audience, with cast diagrams and such, and a whole week of strips might be devoted to Mike and Zonker responding to reader mail.
I believe the first evidences of this change were Jan 2, 1983, the last strip before Doonesbury's eighteen-month hiatus, and Sep 30, 1984, the first strip after its return. In these cases, I am uncertain whether to assign them the location of the White House - where they are set within the interior strip continuity - no location at all (which might be the most accurate), or a special top-level location (at the same level as strip continuity's "Earth", "Skylab", and "Moon") reserved only for fourth wall-breaking strips.
By using storylines and subject matter to "filter" strips, I can ensure that the construction of strip summaries remains consistent, with similar terminology used in similar scenarios throughout. I can use simple queries on strip summaries to find and fix technical and stylistic errors, like double spaces and improperly constructed quotes.
Refine Name Search Query
The character and location name search queries will successfully return a result set ordered by prominence, but the order in which it searches for results within the names associated with a given Character_ID or Location_ID is unknown, and perhaps haphazard.
Ideally, this order, which I am not sure how to manipulate using a single query, would be as follows for characters and locations, respectively:
For characters:
Context = "Common", "Full", then "Birth", all while Placement = 0.
Placement ASC, while Context = "[None]"
For locations:
Context = "Common" and "Full", while Placement = 0.
Placement ASC, while Context = "Historical"
Placement ASC, while Context = "[None]"
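One possible way to express these orderings in a single query is a CASE expression in the ORDER BY clause, with Placement as the tie-breaker. The sketch below uses Python's built-in sqlite3 module purely for demonstration. The Names table, its Owner_ID column, and the function names are hypothetical, and the project's actual database may need small syntax adjustments.

```python
import sqlite3

# Rank expressions for the two orderings listed above. Lower ranks sort
# first; rows that share a rank fall back to Placement ASC.
CHARACTER_RANK = """
    CASE
        WHEN Placement = 0 AND Context = 'Common' THEN 0
        WHEN Placement = 0 AND Context = 'Full'   THEN 1
        WHEN Placement = 0 AND Context = 'Birth'  THEN 2
        WHEN Context = '[None]'                   THEN 3
        ELSE 4
    END
"""
LOCATION_RANK = """
    CASE
        WHEN Placement = 0 AND Context = 'Common' THEN 0
        WHEN Placement = 0 AND Context = 'Full'   THEN 1
        WHEN Context = 'Historical'               THEN 2
        WHEN Context = '[None]'                   THEN 3
        ELSE 4
    END
"""


def create_names_table(conn):
    """Create a hypothetical table holding the names of one record type."""
    conn.execute(
        "CREATE TABLE Names ("
        "Owner_ID INTEGER, Name TEXT, Context TEXT, Placement INTEGER)"
    )


def ordered_names(conn, owner_id, rank_sql):
    """Return the names for one character or location in ranked order.

    `rank_sql` must be one of the constants above; it is interpolated
    into the query and must never come from user input.
    """
    query = (
        "SELECT Name FROM Names WHERE Owner_ID = ? "
        f"ORDER BY {rank_sql}, Placement ASC"
    )
    return [row[0] for row in conn.execute(query, (owner_id,))]
```

The same rank expression could also be dropped into the existing prominence query as a secondary sort key, which would make the order of names within each Character_ID or Location_ID deterministic.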
Panel Signage vs. Scene Signage
Scene signage is an abstraction of how the text actually appears in the strip. A useful abstraction for eliminating repetition, but an ultimately harmful abstraction all the same, since it detracts from Strip_Text's intended purpose as a dry transcription of all text that appears in each strip.
A more accurate system for eliminating certain forms of repetition while retaining the necessary granularity would be to use a separate table to associate appearances of identical signage, both within a single strip and more generally.
One challenge I experience while transcribing signage text that is barely legible on its own is that graphical context in the same strip, location, or storyline often makes the text content obvious... and once the text has been deciphered by these means it is impossible for me to consider it merely on its own behalf.
If this project is eventually granted official license to use strip images, maintaining separate records will make it easier to develop a crowd-sourcing system for confirming my transcription of strip text given varying amounts of context.
The throwaway panels, the first two panels associated with a Sunday strip, which may optionally be excised if the newspaper editor desires, currently comprise the first Scene of each Sunday strip. Eventually, I may be able to use this knowledge to restrict text in those panels from text searches, or to indicate when characters associated with a Sunday strip appear only in the throwaway panels.
Certain characters in the strip appear under an alias for large swaths of time. Rufus Jackson, for example, calls himself "Thor" for more than two years, and Benjy Doonesbury calls himself "Sal Putrid" for even longer (though this happens beyond the confines of the current dataset).
I've already taken the most critical step to account for aliases by adding multiple names to characters and allowing users to search for them in the Find A Strip function, but I should also eventually allow for the appearance of an alias, instead of the common name, in the record of those strips in which the character appears under that alias.
Scene System too Rigid?
The Scene system is a useful compromise between setting Location/Characters per Panel and per Strip, but it is at times too rigid and inviting of repetition. Where the scenes exist in Sunday strips to distinguish between throwaway panels and those needed to convey strip meaning, they do not always indicate a change of cast or locale.
I should perhaps make it possible to set locations and characters by either Strip or Scene, depending on need, and manage that setting using a database field attached to each strip.
Also, "Scenes" should be renamed and repurposed as "Divisions" to reflect the shared use of the construct both to mark instances where separate locations and casts of characters occur within a single strip and to distinguish throwaway panels from the remainder of a Sunday strip.
AJAX-enabled user interface
I should eventually seek to submit form data and update the page using AJAX.
Find A Character
I should add a "Find A Character" function to my Doonesbury Navigator project, equivalent to the current "Find A Strip" function, which will allow filtering by the following criteria:
The function of these constraints will be substantively identical to the current "Search" functionality in "Find A Strip" "Character Constraints". Like "Find A Strip" "Text Constraints", however, the user will be able to see and individually manage the active constraints.
Constrains the result set to only characters that appear and/or are mentioned within a given strip date range. Optionally, this can be made exclusive, returning characters who *only* appear and/or are mentioned within the date range.
Constrains the result set by various properties, including Gender, Origin, and others as I think of them.
When Find A Character constraints are active, the user should have the choice to use them, rather than the independent search function, to populate the "Find A Strip" "Character Constraints" search results.
Manner of Character Use
To more accurately reflect which characters in the Doonesbury universe have actually met, I should add a new field to the relational table that associates Characters with a Scene. This field will track whether said character is Present or Not Present. The former applies to characters that appear in a scene or speak from off-panel; the latter to characters whose speech is broadcast via radio, television, or some other means.
Tracking character entrances and exits within individual scenes using the database equivalent of "stage blocking" may help better gauge which characters have met in the Doonesbury universe, but at such cost and for such limited application that it is probably not worth it.
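To illustrate how the proposed Present / Not Present field could be used, the sketch below treats two characters as having met only when both are Present in the same scene. The data layout and names are assumptions, not the project's schema, and scene-level co-presence is only an approximation given the entrance and exit caveat above.

```python
from itertools import combinations


def characters_who_met(scene_cast):
    """Return the set of character pairs that share a scene while Present.

    `scene_cast` maps a scene id to a list of (character, is_present)
    pairs, where is_present is False for characters whose speech is only
    broadcast via radio, television, or similar. This layout is a
    hypothetical stand-in for the relational table.
    """
    met = set()
    for cast in scene_cast.values():
        # Ignore anyone who is merely broadcast into the scene.
        present = sorted({name for name, is_present in cast if is_present})
        for first, second in combinations(present, 2):
            met.add(frozenset((first, second)))
    return met
```

Applied to the February 1, 2004 strip, Rick Redfern and Donald Rumsfeld would count as having met each other in the Washington, D.C. scene, but neither would count as having met Zipper Harris, since they reach his scene only through the television.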