Minutes: 111009 Morning
From zedwiki
IRC Live Notes - Tuesday Morning
[09:10am]Boris_G:Tuesday morning: going through results of yesterday's small group sessions
[09:10am]Boris_G:Video support group
[09:10am] Romain joined the chat room.
[09:11am] cwal joined the chat room.
[09:12am]Boris_G:See http://www.digitaltalkingbook.com/zw/ZedDist_Design_Goals for actual requirements generated
[09:12am]Boris_G:Overview of use cases
[09:13am]Boris_G:First use case: video is master -- the publication is based around a video
[09:13am]reubenfir:http://www.digitaltalkingbook.com/zw/ZedDist_Video_UseCases_Requirements#Overview_of_Use-Cases
[09:13am]Boris_G:text for captions and/or transcript
[09:14am]Boris_G:audio can be used for audible description (with text alt) - need to pause the video for this if there isn't a long-enough silent gap
[09:15am]Boris_G:Use case: sign language / lip-reading is master
[09:16am]Boris_G:text channel can synchronize with this (but note that with sign language, text would be a translation, may not synchronize easily - no 1-to-1 mapping)
[09:17am] JoshA joined the chat room.
[09:17am]Boris_G:Eg, may be highlighting discontiguous chunks of text which all map to a single sign-language expression
[09:19am]Boris_G:May have sign language and lip reading video - two simultaneous video channels. Users generally would choose one or the other, but we're not excluding the possibility that both are displayed
[09:19am] JoshA left the chat room. (Connection closed)
[09:20am] cwal left the chat room. (Read error: Operation timed out)
[09:21am] Romain_ joined the chat room.
[09:21am] Romain left the chat room. (Quit: http://chat.efnet.org )
[09:21am] Romain_ is now known as Romain.
[09:21am]Boris_G:3rd use case: video is secondary - eg short videos interspersed in a text/audio DAISY book
[09:22am]Boris_G:eg disaster preparedness information -- illustrate key ideas with video, animation
[09:23am]Boris_G:Video can be displayed in separate window, in separate region, or in-line in text.
In-line presentation should not be required of players
[09:23am] KennyJoha joined the chat room.
[09:24am]Boris_G:Including video inside text's visual layout would put timed information into an otherwise static space.
[09:25am]Boris_G:Avoid ambiguity in terms of who is in charge of the timing - SMIL should be in charge
[09:26am]KennyJoha:Speech fully disabled.
[09:27am]Boris_G:So, use case 1 = accessible motion picture; starts with video as primary asset
[09:27am]Boris_G:use case 2 = a sign-language book or similar publication
[09:29am]Boris_G:There will be other scenarios, but these use cases should generate requirements that allow other possibilities
[09:30am]Boris_G:eg, could have a video + a video track of speaker's face just for lip reading
[09:31am]Boris_G:How effective is lip reading compared to sign language? -- depends on the person
[09:32am]Boris_G:In a filmed play, might need to get someone to re-voice since faces can't necessarily be seen
[09:33am]Boris_G:Group that needs lip reading is not that big, but there are people who need it
[09:33am] kcreasy joined the chat room.
[09:34am]Boris_G:There is a technique of creating a synthetic face - could be something to try
[09:34am]Boris_G:IBM product
[09:35am]Boris_G:Could also be useful as training materials for learning sign language or lip reading
[09:36am]Boris_G:http://www.digitaltalkingbook.com/zw/ZedDist_Design_Goals#Video_Support
[09:36am]Boris_G:*Requirements*
[09:36am]Boris_G:Need to define at least one video codec that user-agents must implement
[09:37am]Boris_G:Likely to lead to heated discussions!
[09:38am]Boris_G:2.1 accurate frame-based video positioning - must have precise links to time points
[09:38am]Boris_G:sign lang. can be very fast
[09:38am]Boris_G:Link It does this with frame-based notation -- but this depends on the codec
[09:39am]Boris_G:2.2 Accessible Labeling
[09:40am]Boris_G:Q: has problem of frame-based video positioning been solved before?
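Editor's sketch (not from the session): requirement 2.1 above asks for frame-accurate positioning. Assuming a constant frame rate, a frame index can be mapped to a clock-style time value roughly as follows. The function name `frame_to_clock`, the millisecond precision, and the example frame rates are illustrative assumptions; nothing here is taken from a DAISY or SMIL specification, and variable-frame-rate video would need a different approach.

```python
from fractions import Fraction


def frame_to_clock(frame: int, fps: Fraction) -> str:
    """Convert a zero-based frame index to an hh:mm:ss.mmm string.

    Assumes a constant frame rate. Exact rational arithmetic avoids
    drift with rates such as 30000/1001.
    """
    if frame < 0 or fps <= 0:
        raise ValueError("frame must be >= 0 and fps must be > 0")
    seconds = Fraction(frame) / fps
    total_ms = round(seconds * 1000)      # nearest millisecond
    secs, ms = divmod(total_ms, 1000)
    mins, secs = divmod(secs, 60)
    hours, mins = divmod(mins, 60)
    return f"{hours:02d}:{mins:02d}:{secs:02d}.{ms:03d}"


if __name__ == "__main__":
    print(frame_to_clock(100, Fraction(25)))           # 25 fps
    print(frame_to_clock(100, Fraction(30000, 1001)))  # 30000/1001 fps
```

The same frame number lands at a different time for each frame rate, which is one way to see why a purely frame-based notation depends on how the video was encoded.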
[09:41am]Boris_G:SMIL uses NPT ("Normal Play Time")
[09:41am]Boris_G:We're not aware of what extensions / other possibilities there may be
[09:42am]Boris_G:2.2 Accessible Labeling
[09:42am]Boris_G:Associate multimedia labels with sections of the document -- like DAISY 3 resource file but extend to more media types
[09:44am]Boris_G:After 7 years, nobody really using resource file
[09:44am]Boris_G:Should we get rid of it? Something is wrong...
[09:45am]Boris_G:SMIL 3 has "label" attribute for this purpose
[09:45am]kcreasy:Producers use only as much DAISY as it takes to get the job done. Resource files are not required and are extra work.
[09:46am]Boris_G:label has URL fragment - can point to any media object
[09:47am]Boris_G:Simpler for players, since information is all local rather than indirected through resource file
[09:47am]Boris_G:3.1 Volume Control
[09:48am]Boris_G:Author can define default levels for individual streams of audio
[09:49am]Boris_G:Users should be able to adjust individual track levels as well
[09:50am]Boris_G:You may not be able to modify video assets you are republishing - so need to adjust volume in SMIL
[09:50am]Boris_G:Need to label tracks so that users can know what they are & modify them as they need
[09:52am]Boris_G:Semantic values of what the various tracks are *for* -- something higher level than SMIL
[09:52am]Boris_G:3.2 Speed and pitch control
[09:52am]Boris_G:This would be adjusted for presentation as a whole, not individual tracks since they are synchronized
[09:53am]Boris_G:SMIL should help ensure consistency of the time graph
[09:54am]Boris_G:Finding good components to adjust speed of audio has been a difficulty for implementors
[09:54am]Boris_G:will this be even harder with video?
[09:54am]Boris_G:(the only easy method for audio is the "Donald Duck" method...)
[09:55am]Boris_G:HTML5 has media rate control; DirectX / Windows Media Player may support this out of the box
[09:55am]Boris_G:3.3 Layout and positioning control
[09:56am]Boris_G:Presentation layouts defined by author; customizable by user
[09:57am]Boris_G:Semantic labels on the different streams will be helpful here
[09:58am]Boris_G:Make sure videos are compressed in a way so that they can be shown at a reasonable size - to see face & hands, etc
[09:58am]Boris_G:Adapt to playback on mobile phone vs. desktop system
[09:59am]Boris_G:May need alternate renditions of same video track
[09:59am]Boris_G:Different versions embedded within same container [views], or different editions for different devices?
[10:00am]Boris_G:Same problem as editions with / without images.
[10:00am]Boris_G:Spec doesn't have to talk too much about this; will be worked out by distributors
[10:01am]Boris_G:Would speed up of sign language track potentially change the meaning?
[10:01am]Boris_G:No -- but could definitely hinder understanding
[10:02am]Boris_G:Depends on how it's filmed. If you are speaking very clearly, you can change the speed. But there's a limit to how fast or how slow you can go.
[10:02am]Boris_G:If too slow, lose the thread of the message
[10:03am]Boris_G:In audio, 2x normal speed is relatively common
[10:03am]Boris_G:Haven't really experimented with this in sign language, but depends on the signer & how clear they are
[10:04am]Boris_G:Best if the user can adjust and see what works in each case
[10:04am]Boris_G:Sampling rate on the video must be sufficient to enable speedup
[10:05am]Boris_G:Image quality will matter too -- lighting, etc
[10:06am]Boris_G:Consider separate windows / separate screen presentation. Multiple screens will become more common.
[10:06am]Boris_G:3.4 Pause / Resume
[10:06am] danielwec left the chat room.
(Ping timeout: 240 seconds)
[10:06am]Boris_G:One video stream may need to be paused / padded to allow another stream to complete (eg, audio description)
[10:07am]Boris_G:SMIL has mechanisms, but not sure about complexity, implementation
[10:08am]Boris_G:Many potential ways to do this, but we should dictate one way to do it.
[10:08am] danielwec joined the chat room.
[10:08am]Boris_G:3.5 Channel / Track selection
[10:08am]Boris_G:Channel can be turned on/off by user - eg turn captions, audio descriptions off
[10:09am]Boris_G:Q: can there be multiple SMIL documents? Yes, but currently one "logical" document broken up into time segments
[10:10am]Boris_G:Maybe there could be alternative documents to implement different views?
[10:10am]Boris_G:Would be difficult to change from one view to another mid-stream in that case
[10:11am]Boris_G:Common practice to have 1 SMIL file per chapter - or other conventions in different organizations
[10:11am]KennyJoha:Speech fully enabled.
[10:11am]Boris_G:Note to us: consider recommending a guideline here
[10:11am]Boris_G:Large SMIL files are problematic for players
[10:13am]Boris_G:Don't want to try having a SMIL file for each permutation of channels on / off - could be a lot of combinations
[10:13am]Boris_G:Could we have a higher-level language which is used to generate a SMIL document on the client side
[10:14am]Boris_G:May want to build, eg audio-only editions for a player that has no video capabilities - download much smaller document
[10:16am] cwal joined the chat room.
[10:16am] cwal left the chat room. (Client Quit)
[10:16am] cwal joined the chat room.
[10:18am]Boris_G:For turning tracks on / off - would you use SMIL CustomTest? No simple way to do this.
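Editor's sketch (not from the session): the worry above about needing one SMIL file per on/off permutation, and the alternative of describing tracks at a higher level and filtering on the client, can be pictured with a small model. `Track`, the role names, the file names, and `select_tracks` are all hypothetical; this is not a SMIL or DAISY mechanism, only an illustration of why per-permutation files scale badly while a role-based filter does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    role: str  # hypothetical role label, e.g. "captions"
    src: str   # hypothetical media file name


def variant_count(tracks):
    """On/off combinations if every track can be switched independently."""
    return 2 ** len(tracks)


def select_tracks(tracks, disabled_roles):
    """Keep only the tracks whose role the user has not switched off."""
    return [t for t in tracks if t.role not in disabled_roles]


TRACKS = [
    Track("main-video", "film.mp4"),
    Track("captions", "captions.xml"),
    Track("audio-description", "description.mp3"),
    Track("sign-language", "signer.mp4"),
]

if __name__ == "__main__":
    # Four switchable tracks already imply 16 pre-built variants.
    print(variant_count(TRACKS))
    # One description plus a user preference yields the view on demand.
    for track in select_tracks(TRACKS, {"captions", "audio-description"}):
        print(track.role, track.src)
```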
[10:18am]Boris_G:SMIL doesn't have a concept of a channel - just little clips
[10:19am]KennyJoha:3/clear
[10:19am]Boris_G:Directive is that authors need to be able to label tracks in terms of their role - so user can choose
[10:19am]Boris_G:4.1 Semantic annotations
[10:20am]Boris_G:annotating between media types - "this is an audio description of that"
[10:21am]Boris_G:Questions
[10:21am]Boris_G:Q: How evilly complex would a SMIL engine be to support all this?
[10:22am]Boris_G:AMIS - a real SMIL engine
[10:22am] KennyJoha is now known as kennyjoha.
[10:22am]Boris_G:But most DAISY players just implement a small subset.
[10:22am]Boris_G:Eg, for text-only you can ignore SMIL altogether
[10:23am]Boris_G:Tricky part is juggling different media being in charge of the timing. Video per se doesn't make a big difference.
[10:23am]Boris_G:But if user can turn off a track and that changes who is in charge of timing, that is tricky
[10:24am]Boris_G:Timing Events - Syncing and slipping - this is very hard.
[10:25am] cwal left the chat room. (Ping timeout: 480 seconds)
[10:25am]Boris_G:Lip reading video & audio, if in separate tracks - need absolutely precise synch
[10:25am]Boris_G:Need fine-grained control, can't just hand off to separate playback engine
[10:27am]Boris_G:Do we need a way to break out of SMIL in some cases? Sometimes declarative markup is harder than creating some active code
[10:27am]Boris_G:Link-it uses something more like style sheets
[10:28am]Boris_G:SMIL based on strict containment - nested par & seq elements
[10:28am]Boris_G:Makes overlapping things hard
[10:28am]Boris_G:Might be easier to have something like a database of facts / metadata about the various assets
[10:29am]Boris_G:"This video range and this text range are associated"
[10:30am]Boris_G:Maybe some sort of timesheet - shows channels each on a line, and lists points of correspondence
[10:30am] mgylling left the chat room.
(Quit: mgylling)
[10:31am]Boris_G:There may not be any player / user that actually uses all tracks of information.
[10:31am]Boris_G:How can we make it easy to flatten SMIL into a simpler timeline?
[10:32am]kennyjoha:Speech fully disabled.
[10:33am]Boris_G:Is there a distinction between authoring and distribution here?
[10:33am]Boris_G:Maybe SMIL is relevant at one point in the process.
[10:33am]Boris_G:Thanks to video group for tremendous work!
[10:33am]Boris_G:[15 minute break starting]
kennyjoha is now known as kenny_j.
[10:41am] danielwec left the chat room. (danielwec)
[10:50am] cwal joined the chat room.
[10:52am] cwal left the chat room.
[10:52am] mgylling joined the chat room.
[10:52am] mgylling_ joined the chat room.
[10:57am] danielwec joined the chat room.
[10:57am]Boris_G:Marisa: interactivity-related requirements
[10:58am]Boris_G:DAISY 4 wants to support educational content -- testing, quizzes. Also things like tax forms, order forms
[10:58am]Boris_G:Requirement: timing in tests
[10:59am]Boris_G:Generally user agent can override any timing in document; though tests want to specify time constraints
[10:59am]Boris_G:Req: dynamic content flow.
[10:59am]Boris_G:Subsequent content can depend on answers to questions.
[11:00am]Boris_G:A branching tree of questions
[11:01am]Boris_G:Risk of creating really confusing navigation model -
[11:01am]Boris_G:What are implications for NCX? "Section 2" is not a single, static thing?
[11:02am]Boris_G:Extreme example - "who wants to be a millionaire" implemented as DVD menus
[11:03am]Boris_G:May be better to allow this to be a specialized application that someone builds, rather than us adding the logic into general documents
[11:03am]Boris_G:Req: security and encryption
[11:03am]Boris_G:Tests require prevention of unauthorized access - documents and answers submitted
[11:04am]Boris_G:Req: Locking portions of exam - no going back to change answers in earlier sections
[11:04am]Boris_G:Though this would conflict with normal navigation model
[11:05am]Boris_G:Req: accepting input from users - a submission model
[11:05am]Boris_G:Want to store answers in a reasonably flexible format to enable various use cases.
[11:06am]Boris_G:(but does this conflict with security?)
[11:06am] Kenny-j joined the chat room.
[11:06am]Boris_G:Would prefer not requiring submit button - have information saved automatically
[11:07am]Boris_G:Req: incremental submissions. Not necessarily always saving a complete set of answers.
[11:07am]Boris_G:Req: group contextually-related components. Question, answer, perhaps also data on which the question is based
[11:08am]Boris_G:Ability to point from submitted data back to question.
[11:08am] kenny_j left the chat room.
[11:08am]Boris_G:Loading submissions - should be able to load up a set of answers. Could have multiple sets of answers to same form (eg my taxes, your taxes)
[11:09am]Boris_G:Extensibility of submission format - a flexible format should support things like aggregating answers of different users. Details of how to do this not part of DAISY spec.
[11:10am]Boris_G:Types of input.
[11:10am]Boris_G:Typical input types: multiple choice, text, numeric range
[11:11am]Boris_G:"Two-way accessibility" -- if we have sign language books, should accept sign language input
[11:11am]Boris_G:Common form languages don't have eg a video input type, however.
[11:12am]Boris_G:Note: just talking about raw video submission - not machine-translated sign language!
[11:12am]Boris_G:Mathematical input - in math test, need ability to input some math as answer
[11:13am]Boris_G:Needed to support higher level math testing - but this is definitely complex
[11:16am]Boris_G:There are various math input user interfaces - wondering if there are accessible ones
[11:16am]Boris_G:Will need things like echoing of inputted text
[11:17am]Boris_G:user agents will want to link with existing assistive technology
[11:17am]Boris_G:Forms and presentation flow -- will be different than flow of, say, a novel
[11:17am]Boris_G:Things like feedback on inputs - hints, errors
[11:17am]Boris_G:Pausing playback when input is expected
[11:18am]Boris_G:re-start when input is complete (or time limit expires)
[11:18am]Boris_G:Client-side validation -- supported by modern form frameworks.
[11:19am]Boris_G:Eg entry of numbers vs. text - client should validate
[11:19am]Boris_G:Required vs. optional fields; and more complex things like fields that are required in certain circumstances
[11:19am]Boris_G:The validation will need to play well with fallbacks
[11:20am]Boris_G:[Skipping down a bit in the document to more related items]
[11:20am]Boris_G:Input timestamps - could be useful as part of submission format
[11:21am]Boris_G:Wall clock time, not in re SMIL timeline
[11:21am]Boris_G:Compare to metadata on document - time/date
[11:22am]Boris_G:Relationship of forms to profiles. We're imagining forms is a Feature.
[11:23am]Boris_G:Eg audio-only profile would probably not allow for image or video input
[11:23am]Boris_G:[scrolling up to "Math Navigation"]
[11:24am]Boris_G:[@Markus will move this item under Text Media CDI]
[11:25am]Boris_G:SMIL needs to be able to point into MathML subcomponents - for fine-grained navigation
[11:26am]Boris_G:Annotations.
Not part of forms, but related since both deal with user input
[11:26am]Boris_G:Today's DAISY bookmarks file - allows for highlighting, notes on sections.
[11:27am]Boris_G:Want to be able to save, send, overlay annotations from multiple people
[11:27am]Boris_G:A user-agent feature really
[11:27am]Boris_G:Combine this with previous entry on improving bookmark DTD
[11:27am]Boris_G:Allow all media types as valid annotations
[11:28am]Boris_G:Current Bookmark DTD allows text and audio
[11:28am]Boris_G:Extension - to comment on an existing annotation - add a reference to other annotation
[11:29am]Boris_G:Or, refer to a submitted answer to a question (eg, teacher's feedback)
[11:30am]Boris_G:[@Marisa or Boris to move annotations items to combine with existing bookmarks item]
[11:30am]Boris_G:Let's rename bookmark feature to 'annotations'
[11:30am]Boris_G:Annotations may become a product -- eg a professor's notes on Hamlet
[11:30am]Boris_G:commercial annotations on a public domain text
[11:31am]Boris_G:Interest in this for EPub and DAISY
[11:31am]Boris_G:George has 400 questions...
[11:32am]Boris_G:Think about one Question. What does this look like?
[11:33am]Boris_G:There may be a math equation - up to the player how to render some math
[11:33am]Boris_G:Now an authoring issue - could mark up detailed navigation
[11:36am]Boris_G:Math rendering should be the same in context of a test question or in general reading
[11:37am]Boris_G:Handling of math may be a differentiating feature of players
[11:37am]Boris_G:Same issue with tables - special table-reading mode is helpful, implemented by some players
[11:41am]Boris_G:For math formula navigation, may be better to leave it to user agents to go into text document and generate synthetic speech of any desired component
[11:43am]Boris_G:Even in reading a regular text sentence - you may want to pick it apart and look at sub-pieces.
[11:44am]Boris_G:Current user agents just play the SMIL - don't normally go into the text document to allow this sort of richer reading experience.
[11:45am]Boris_G:How do we encourage development of options?
[11:45am]Boris_G:User agent developers may be hesitant to trust SMIL-refs, since they are not there in older DAISY books
[11:46am]Boris_G:Q about forms.
[11:46am]Boris_G:Looking at standard set of widgets, plus plugins for things like math
[11:48am]Boris_G:User agent is free to use whatever widgets it has to render the form controls
[11:50am]Boris_G:Some, but not all user agents will use a browser component to render text part of DAISY book. This would come with its own form controls.
[11:50am]Boris_G:But this is user agent developers' choice
[11:51am]Boris_G:Q: concern about security question
[11:52am]Boris_G:Could submission mechanism be abused to, eg, track readers of a non-test book?
[11:54am]Boris_G:Encourage user-agent developers to get explicit user approval before posting data
[11:54am]kcreasy:If I wanted to spy I probably wouldn't use the formal submission model.
[11:54am]Boris_G:Risk has been there previously - eg, externally-referenced images
[11:56am]Boris_G:Servers should be identified with certificates - so you know where you are sending your info
[11:58am]Boris_G:Q: some workbooks will have an answer key
[11:58am]Boris_G:Would be convenient to connect answers with questions
[11:59am]Boris_G:In Braille production, may want to move answers so that they are in the same volume with questions
[11:59am]Boris_G:Maybe similar to note / noteref
[12:01pm]Boris_G:Q: do we want to specify a format for submissions, so that they would be compatible between players?
[12:02pm]Boris_G:Similar to bookmarks schema, but not necessarily the same
[12:04pm]Boris_G:Want the ability to share annotations with others; at your own choice
[12:05pm]Boris_G:Answers to questions - for something like a test, the assumption is these are shared, but user should still have knowledge/control over where information is going
[12:05pm]Boris_G:Could annotations and question answers have the same schema? May be too early to tell
[12:06pm]Boris_G:[After moving annotations items out, this section of the goals document will be renamed "Forms"]
[12:08pm]Boris_G:After lunch - will check in with requirements database to see if we've forgotten anything
[12:09pm]Boris_G:Then start to think about strawman creation
[12:09pm] kcreasy left the chat room. (leaving)
[12:10pm]Boris_G:Big question about spec - do we need an additional component in fileset - general data about streams of information and how they relate - to keep some complexity out of SMIL.
[12:10pm]Boris_G:[breaking for lunch for 1 hour]
