Minutes: 111009 Morning

From zedwiki

IRC Live Notes - Tuesday Morning

[09:10am]Boris_G:Tuesday morning: going through results of yesterday's small group sessions
[09:10am]Boris_G:Video support group
[09:10am] Romain joined the chat room.
[09:11am] cwal joined the chat room.
[09:12am]Boris_G:See http://www.digitaltalkingbook.com/zw/ZedDist_Design_Goals for actual requirements generated
[09:12am]Boris_G:Overview of use cases
[09:13am]Boris_G:First use case: video is master -- the publication is based around a video
[09:13am]reubenfir:http://www.digitaltalkingbook.com/zw/ZedDist_Video_UseCases_Requirements#Overview_of_Use-Cases
[09:13am]Boris_G:text for captions and/or transcript
[09:14am]Boris_G:audio can be used for audible description (with text alt)  - need to pause the video for this if there isn't a long-enough silent gap
[09:15am]Boris_G:Use case: sign language / lip-reading is master
[09:16am]Boris_G:text channel can synchronize with this (but note that with sign language, text would be a translation and may not synchronize easily - no 1-to-1 mapping)
[09:17am] JoshA joined the chat room.
[09:17am]Boris_G:Eg, may be highlighting discontiguous chunks of text which all map to single sign-language expression
[09:19am]Boris_G:May have sign language and lip reading video - two simultaneous video channels.  Users generally would choose one or the other, but we're not excluding the possibility that both are displayed
[09:19am] JoshA left the chat room. (Connection closed)
[09:20am] cwal left the chat room. (Read error: Operation timed out)
[09:21am] Romain_ joined the chat room.
[09:21am] Romain left the chat room. (Quit: http://chat.efnet.org )
[09:21am] Romain_ is now known as Romain.
[09:21am]Boris_G:3rd use case: video is secondary - eg short videos interspersed in a text/audio DAISY book
[09:22am]Boris_G:eg disaster preparedness information -- illustrate key ideas with video, animation
[09:23am]Boris_G:Video can be displayed in separate window, in separate region, or in-line in text.  In-line presentation should not be required of players
[09:23am] KennyJoha joined the chat room.
[09:24am]Boris_G:Including video inside text's visual layout would put timed information into an otherwise static space.
[09:25am]Boris_G:Avoid ambiguity in terms of who is in charge of the timing - SMIL should be in charge
[09:26am]KennyJoha:Speech fully disabled.
[09:27am]Boris_G:So, use case 1 = accessible motion picture; starts with video as primary asset
[09:27am]Boris_G:use case 2 = a sign-language book or similar publication
[09:29am]Boris_G:There will be other scenarios, but these use cases should generate requirements that allow other possibilities
[09:30am]Boris_G:eg, could have a video + a video track of speaker's face just for lip reading
[09:31am]Boris_G:How effective is lip reading compared to sign language?  -- depends on the person
[09:32am]Boris_G:In a filmed play, might need to get someone to re-voice since faces can't necessarily be seen
[09:33am]Boris_G:Group that needs lip reading is not that big, but there are people who need it
[09:33am] kcreasy joined the chat room.
[09:34am]Boris_G:There is a technique of creating a synthetic face - could be something to try
[09:34am]Boris_G:IBM product
[09:35am]Boris_G:Could also be useful as training materials for learning sign language or lip reading
[09:36am]Boris_G:http://www.digitaltalkingbook.com/zw/ZedDist_Design_Goals#Video_Support
[09:36am]Boris_G:*Requirements*
[09:36am]Boris_G:Need to define at least one video codec that user-agents must implement
[09:37am]Boris_G:Likely to lead to heated discussions!
[09:38am]Boris_G:2.1 accurate frame-based video positioning - must have precise links to time points
[09:38am]Boris_G:sign lang. can be very fast
[09:38am]Boris_G:Link It does this with frame-based notation -- but this depends on the codec
[09:39am]Boris_G:2.2 Accessible Labeling
[09:40am]Boris_G:Q: has problem of frame-based video positioning been solved before?
[09:41am]Boris_G:SMIL uses NPT "Normal Play Time"
[09:41am]Boris_G:We're not aware of what extensions / other possibilities there may be
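The relationship between frame-based positions and NPT clock values can be illustrated with a minimal sketch. It assumes a constant frame rate and a player that accepts an "npt=" clock value in seconds; the function name frame_to_npt and the millisecond precision are illustrative choices, not something the group agreed on.

```python
from fractions import Fraction


def frame_to_npt(frame_index: int, fps: Fraction) -> str:
    """Return an 'npt=' clock value (millisecond precision) for the start of a frame.

    Hypothetical helper: assumes a constant frame rate, which depends on the codec.
    """
    if frame_index < 0:
        raise ValueError("frame_index must be non-negative")
    # Exact rational arithmetic avoids drift with rates such as 30000/1001.
    seconds = Fraction(frame_index) / fps
    millis = round(seconds * 1000)
    return f"npt={millis // 1000}.{millis % 1000:03d}s"


print(frame_to_npt(312, Fraction(25)))           # frame 312 at 25 fps
print(frame_to_npt(300, Fraction(30000, 1001)))  # frame 300 at NTSC rate
```

At 25 fps one frame lasts 40 ms, so a millisecond-precision clock value can address individual frames; variable frame rate content would need a different approach.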
[09:42am]Boris_G:2.2 Accessible Labeling
[09:42am]Boris_G:Associate multimedia labels with sections of the document -- like DAISY 3 resource file but extend to more media types
[09:44am]Boris_G:After 7 years, nobody really using resource file
[09:44am]Boris_G:Should we get rid of it?  Something is wrong...
[09:45am]Boris_G:SMIL 3 has "label" attribute for this purpose
[09:45am]kcreasy:Producers use only as much DAISY as it takes to get the job done. Resource files are not required and are extra work.
[09:46am]Boris_G:label has URL fragment - can point to any media object
[09:47am]Boris_G:Simpler for players, since information is all local rather than indirected through resource file
[09:47am]Boris_G:3.1 Volume Control
[09:48am]Boris_G:Author can define default levels for individual streams of audio
[09:49am]Boris_G:Users should be able to adjust individual track levels as well
[09:50am]Boris_G:You may not be able to modify video assets you are republishing - so need to adjust volume in SMIL
[09:50am]Boris_G:Need to label tracks so that users can know what they are & modify them as they need
[09:52am]Boris_G:Semantic values of what the various tracks are *for* -- something higher level than SMIL
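One possible reading of the volume requirement (author defaults per labeled track, overridable by the user) is sketched below. The track labels, the 0.0 to 1.0 level scale, and the function name effective_levels are assumptions made for illustration only.

```python
def effective_levels(author_defaults: dict, user_overrides: dict) -> dict:
    """Combine author default levels with user adjustments, per labeled track.

    Levels are assumed to be on a 0.0-1.0 scale; user values win and are clamped.
    """
    levels = {}
    for track, default in author_defaults.items():
        level = user_overrides.get(track, default)
        levels[track] = min(1.0, max(0.0, level))
    return levels


# Hypothetical track labels chosen for this example.
defaults = {"dialogue": 1.0, "music": 0.6, "description": 0.8}
overrides = {"music": 0.2, "description": 1.5}
print(effective_levels(defaults, overrides))
```

Because the levels live outside the media assets, this works even when the republished video itself cannot be modified.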
[09:52am]Boris_G:3.2 Speed and pitch control
[09:52am]Boris_G:This would be adjusted for presentation as a whole, not individual tracks since they are synchronized
[09:53am]Boris_G:SMIL should help ensure consistency of the time graph
[09:54am]Boris_G:Finding good components to adjust speed of audio has been a difficulty for implementors
[09:54am]Boris_G:will this be even harder with video?
[09:54am]Boris_G:(the only easy method for audio is the "Donald Duck" method...)
[09:55am]Boris_G:HTML5 has media rate control;  DirectX / Windows Media Player may support this out of the box
[09:55am]Boris_G:3.3 Layout and positioning control
[09:56am]Boris_G:Presentation layouts defined by author; customizable by user
[09:57am]Boris_G:Semantic labels on the different streams will be helpful here
[09:58am]Boris_G:Make sure videos are compressed in a way so that they can be shown at a reasonable size - to see face & hands, etc
[09:58am]Boris_G:Adapt to playback on mobile phone vs. desktop system
[09:59am]Boris_G:May need alternate renditions of same video track
[09:59am]Boris_G:Different versions embedded within same container [views], or different editions for different devices?
[10:00am]Boris_G:Same problem as editions with / without images.
[10:00am]Boris_G:Spec doesn't have to talk too much about this; will be worked out by distributors
[10:01am]Boris_G:Would speed up of sign language track potentially change the meaning?
[10:01am]Boris_G:No -- but could definitely hinder understanding
[10:02am]Boris_G:Depends on how it's filmed.  If you are speaking very clearly, you can change the speed.  But there's a limit to how fast or how slow you can go.
[10:02am]Boris_G:If too slow, lose the thread of the message
[10:03am]Boris_G:In audio, 2x normal speed is relatively common
[10:03am]Boris_G:Haven't really experimented with this in sign language, but depends on the signer & how clear they are
[10:04am]Boris_G:Best if the user can adjust and see what works in each case
[10:04am]Boris_G:Sampling rate on the video must be sufficient to enable speedup
[10:05am]Boris_G:Image quality will matter too -- lighting, etc
[10:06am]Boris_G:Consider separate windows / separate screen presentation.  Multiple screens will become more common.
[10:06am]Boris_G:3.4 Pause / Resume
[10:06am] danielwec left the chat room. (Ping timeout: 240 seconds)
[10:06am]Boris_G:One video stream may need to be paused / padded to allow another stream to complete (eg, audio description)
[10:07am]Boris_G:SMIL has mechanisms, but not sure about complexity, implementation
[10:08am]Boris_G:Many potential ways to do this, but we should dictate one way to do it.
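The arithmetic behind pausing or padding can be sketched as follows. This only shows how long the video would have to be held, not which SMIL mechanism would do it; the function name required_pause is hypothetical.

```python
def required_pause(gap_seconds: float, description_seconds: float) -> float:
    """Seconds the video must be held so an audio description can finish.

    gap_seconds is the silent gap available in the video at that point.
    """
    return max(0.0, description_seconds - gap_seconds)


print(required_pause(2.0, 5.5))  # description longer than the gap: hold the video
print(required_pause(4.0, 3.0))  # description fits in the gap: no pause needed
```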
[10:08am] danielwec joined the chat room.
[10:08am]Boris_G:3.5 Channel / Track selection
[10:08am]Boris_G:Channel can be turned on/off by user - eg turn captions, audio descriptions off
[10:09am]Boris_G:Q: can there be multiple SMIL documents?  Yes, but currently one "logical" document broken up into time segments
[10:10am]Boris_G:Maybe there could be alternative documents to implement different views?
[10:10am]Boris_G:Would be difficult to change from one view to another mid-stream in that case
[10:11am]Boris_G:Common practice to have 1 smil file per chapter - or other conventions in different organizations
[10:11am]KennyJoha:Speech fully enabled.
[10:11am]Boris_G:Note to us:  consider recommending a guideline here
[10:11am]Boris_G:Large SMIL files are problematic for players
[10:13am]Boris_G:Don't want to try having a SMIL file for each permutation of channels on / off - could be a lot of combinations
[10:13am]Boris_G:Could we have a higher-level language which is used to generate a SMIL document on the client side
[10:14am]Boris_G:May want to build, eg audio-only editions for a player that has no video capabilities - download much smaller document
[10:16am] cwal joined the chat room.
[10:16am] cwal left the chat room. (Client Quit)
[10:16am] cwal joined the chat room.
[10:18am]Boris_G:For turning tracks on / off - would you use SMIL CustomTest?  No simple way to do this.
[10:18am]Boris_G:SMIL doesn't have a concept of a channel - just little clips
[10:19am]KennyJoha:3/clear
[10:19am]Boris_G:Directive is that authors need to be able to label tracks in terms of their role - so user can choose
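A rough sketch of role-based track selection follows. The role names and the Track structure are invented for this example; no role vocabulary was defined in the session. The last line shows why one SMIL file per on/off permutation scales badly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    track_id: str
    role: str  # hypothetical role label assigned by the author


def active_tracks(tracks, enabled_roles):
    """Return the ids of the tracks whose role the user has switched on."""
    return [t.track_id for t in tracks if t.role in enabled_roles]


tracks = [
    Track("v1", "main-video"),
    Track("c1", "captions"),
    Track("ad1", "audio-description"),
    Track("s1", "sign-language"),
]
print(active_tracks(tracks, {"main-video", "captions"}))
# Each track is independently on or off, so there are 2**n combinations.
print(2 ** len(tracks))
```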
[10:19am]Boris_G:4.1 Semantic annotations
[10:20am]Boris_G:annotating between media types - "this is an audio description of that"
[10:21am]Boris_G:Questions
[10:21am]Boris_G:Q: How evilly complex would a SMIL engine be to support all this?
[10:22am]Boris_G:AMIS - a real SMIL engine
[10:22am] KennyJoha is now known as kennyjoha.
[10:22am]Boris_G:But most DAISY players just implement a small subset.
[10:22am]Boris_G:Eg, for text-only you can ignore SMIL altogether
[10:23am]Boris_G:Tricky part is juggling different media being in charge of the timing.  Video per se doesn't make a big difference.
[10:23am]Boris_G:But if user can turn off a track and that changes who is in charge of timing, that is tricky
[10:24am]Boris_G:Timing Events - Syncing and slipping - this is very hard.
[10:25am] cwal left the chat room. (Ping timeout: 480 seconds)
[10:25am]Boris_G:Lip reading video & audio, if in separate tracks - need absolutely precise synch
[10:25am]Boris_G:Need fine-grained control, can't just hand off to separate playback engine
[10:27am]Boris_G:Do we need a way to break out of SMIL in some cases?  Sometimes declarative markup is harder than creating some active code
[10:27am]Boris_G:Link-it uses something more like style sheets
[10:28am]Boris_G:SMIL based on strict containment - nested par & seq elements
[10:28am]Boris_G:Makes overlapping things hard
[10:28am]Boris_G:Might be easier to have something like a database of facts / metadata about the various assets
[10:29am]Boris_G:"This video range and this text range are associated"
[10:30am]Boris_G:Maybe some sort of timesheet - shows channels each on a line, and lists points of correspondence
[10:30am] mgylling left the chat room. (Quit: mgylling)
[10:31am]Boris_G:There may not be any player / user that actually uses all tracks of information.
[10:31am]Boris_G:How can we make it easy to flatten SMIL into a simpler timeline?
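Flattening nested par and seq containers into a single timeline might look roughly like this. It is a simplified model, assuming every clip has a known fixed duration and that a par ends when its longest child ends; the class names Clip, Seq and Par are illustrative and do not come from any SMIL library.

```python
from dataclasses import dataclass, field


@dataclass
class Clip:
    name: str
    duration: float  # seconds; assumed known in advance


@dataclass
class Seq:
    children: list = field(default_factory=list)


@dataclass
class Par:
    children: list = field(default_factory=list)


def flatten(node, start=0.0):
    """Return (entries, end), where entries are (begin, end, clip_name) tuples."""
    if isinstance(node, Clip):
        end = start + node.duration
        return [(start, end, node.name)], end
    entries = []
    if isinstance(node, Seq):
        cursor = start
        for child in node.children:  # children play one after another
            child_entries, cursor = flatten(child, cursor)
            entries.extend(child_entries)
        return entries, cursor
    if isinstance(node, Par):
        end = start
        for child in node.children:  # children all begin together
            child_entries, child_end = flatten(child, start)
            entries.extend(child_entries)
            end = max(end, child_end)
        return entries, end
    raise TypeError(f"unsupported node: {node!r}")


tree = Seq([
    Par([Clip("video-1", 10.0), Clip("caption-1", 4.0)]),
    Clip("audio-description-1", 3.0),
])
timeline, total = flatten(tree)
for begin, end, name in timeline:
    print(f"{begin:6.1f} {end:6.1f} {name}")
```

A flat list like this is close to the "timesheet" idea: one row per clip with absolute begin and end times, which a simple player can walk without a full SMIL engine.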
[10:32am]kennyjoha:Speech fully disabled.
[10:33am]Boris_G:Is there a distinction between authoring and distribution here?
[10:33am]Boris_G:Maybe SMIL is relevant at one point in the process.
[10:33am]Boris_G:Thanks to video group for tremendous work!
[10:33am]Boris_G:[15 minute break starting]

kennyjoha is now known as kenny_j.
[10:41am] danielwec left the chat room. (danielwec)
[10:50am] cwal joined the chat room.
[10:52am] cwal left the chat room.
[10:52am] mgylling joined the chat room.
[10:52am] mgylling_ joined the chat room.
[10:57am] danielwec joined the chat room.
[10:57am]Boris_G:Marisa: interactivity-related requirements
[10:58am]Boris_G:DAISY 4 wants to support educational content -- testing, quizzes.  Also things like tax forms, order forms
[10:58am]Boris_G:Requirement: timing in tests
[10:59am]Boris_G:Generally user agent can override any timing in document; though tests want to specify time constraints
[10:59am]Boris_G:Req: dynamic content flow.
[10:59am]Boris_G:Subsequent content can depend on answers to questions.
[11:00am]Boris_G:A branching tree of questions
[11:01am]Boris_G:Risk of creating really confusing navigation model -
[11:01am]Boris_G:What are implications for NCX?  "Section 2" is not a single, static thing?
[11:02am]Boris_G:Extreme example - "who wants to be a millionaire" implemented as DVD menus
[11:03am]Boris_G:May be better to allow this to be a specialized application that someone builds, rather than us adding the logic in to general documents
[11:03am]Boris_G:Req: security and encryption
[11:03am]Boris_G:Tests require prevention of unauthorized access - documents and answers submitted
[11:04am]Boris_G:Req: Locking portions of exam - no going back to change answers in earlier sections
[11:04am]Boris_G:Though this would conflict with normal navigation model
[11:05am]Boris_G:Req: accepting input from users - a submission model
[11:05am]Boris_G:Want to store answers in a reasonably flexible format to enable various use cases.
[11:06am]Boris_G:(but does this conflict with security?)
[11:06am] Kenny-j joined the chat room.
[11:06am]Boris_G:Would prefer not requiring submit button - have information saved automatically
[11:07am]Boris_G:Req: incremental submissions.  Not necessarily always saving a complete set of answers.
[11:07am]Boris_G:Req: group contextually-related components.  Question, answer, perhaps also data on which the question is based
[11:08am]Boris_G:Ability to point from submitted data back to question.
[11:08am] kenny_j left the chat room.
[11:08am]Boris_G:Loading submissions - should be able to load up a set of answers.  Could have multiple sets of answers to same form (eg my taxes, your taxes)
[11:09am]Boris_G:Extensibility of submission format - a flexible format should support things like aggregating answers of different users.  Details of how to do this not part of DAISY spec.
[11:10am]Boris_G:Types of input.
[11:10am]Boris_G:Typical input types: multiple choice, text, numeric range
[11:11am]Boris_G:"Two-way accessibility" -- if we have sign language books, should accept sign language input
[11:11am]Boris_G:Common form languages don't have eg a video input type, however.
[11:12am]Boris_G:Note: just talking about raw video submission - not machine-translated sign language!
[11:12am]Boris_G:Mathematical input - in math test, need ability to input some math as answer
[11:13am]Boris_G:Needed to support higher level math testing - but this is definitely complex
[11:16am]Boris_G:There are various math input user interfaces - wondering if there are accessible ones
[11:16am]Boris_G:Will need things like echoing of inputted text
[11:17am]Boris_G:user agents will want to link with existing assistive technology
[11:17am]Boris_G:Forms and presentation flow -- will be different than flow of, say, a novel
[11:17am]Boris_G:Things like feedback on inputs - hints, errors
[11:17am]Boris_G:Pausing playback when input is expected
[11:18am]Boris_G:re-start when input is complete (or time limit expires)
[11:18am]Boris_G:Client-side validation -- supported by modern form frameworks.
[11:19am]Boris_G:Eg entry of numbers vs. text - client should validate
[11:19am]Boris_G:Required vs. optional fields; and more complex things like fields that are required in certain circumstances
[11:19am]Boris_G:The validation will need to play well with fallbacks
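The validation cases mentioned above (numbers vs. text, required vs. optional, fields required only in certain circumstances) can be sketched as a small rule check. The rule format, field names, and function name validate are made up for this example and are not tied to any particular form framework.

```python
def validate(answers: dict, rules: list) -> list:
    """Return a list of error strings; an empty list means the input is valid.

    Each rule is a dict with 'name', optional 'type' ('number' or 'text'),
    optional 'required', and optional 'required_if' as (other_field, value).
    """
    errors = []
    for rule in rules:
        name = rule["name"]
        value = answers.get(name, "")
        required = rule.get("required", False)
        condition = rule.get("required_if")
        if condition is not None and answers.get(condition[0]) == condition[1]:
            required = True  # conditionally required field
        if value == "":
            if required:
                errors.append(f"{name}: required")
            continue
        if rule.get("type") == "number":
            try:
                float(value)
            except ValueError:
                errors.append(f"{name}: must be a number")
    return errors


# Hypothetical tax-form style fields.
rules = [
    {"name": "income", "type": "number", "required": True},
    {"name": "has_dependents", "type": "text", "required": True},
    {"name": "dependents_count", "type": "number",
     "required_if": ("has_dependents", "yes")},
]
print(validate({"income": "abc", "has_dependents": "yes"}, rules))
print(validate({"income": "42000", "has_dependents": "no"}, rules))
```

Returning messages instead of raising lets a user agent present errors through whatever channel suits the reader, such as speech or braille.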
[11:20am]Boris_G:[Skipping down a bit in the document to more related items]
[11:20am]Boris_G:Input timestamps - could be useful as part of submission format
[11:21am]Boris_G:Wall clock time, not in re SMIL timeline
[11:21am]Boris_G:Compare to metadata on document - time/date
[11:22am]Boris_G:Relationship of forms to profiles.  We're imagining forms is a Feature.
[11:23am]Boris_G:Eg audio-only profile would probably not allow for image or video input
[11:23am]Boris_G:[scrolling up to "Math Navigation"]
[11:24am]Boris_G:[@Markus will move this item under Text Media CDI]
[11:25am]Boris_G:SMIL needs to be able to point into MathML subcomponents - for fine-grained navigation
[11:26am]Boris_G:Annotations.  Not part of forms, but related since both deal with user input
[11:26am]Boris_G:Today's DAISY bookmarks file - allows for highlighting, notes on sections.
[11:27am]Boris_G:Want to be able to save, send, overlay annotations from multiple people
[11:27am]Boris_G:A user-agent feature really
[11:27am]Boris_G:Combine this with previous entry on improving bookmark DTD
[11:27am]Boris_G:Allow all media types as valid annotations
[11:28am]Boris_G:Current Bookmark DTD allows text and audio
[11:28am]Boris_G:Extension - to comment on an existing annotation - add a reference to other annotation
[11:29am]Boris_G:Or, refer to a submitted answer to a question (eg, teacher's feedback)
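An annotation that can point at a document fragment, at another annotation, or at a submitted answer could be modeled as below. The field names, the identifier scheme (such as "answer:q7"), and the file names are hypothetical; the sketch also assumes annotations never reference each other in a cycle.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    annotation_id: str
    author: str
    media_type: str   # any media type: text, audio, video, ...
    content_ref: str  # where the annotation body is stored
    target: str       # document fragment, annotation id, or answer id


def thread(annotations, target):
    """Collect ids of annotations on target, followed by comments on each of them.

    Assumes the reference graph has no cycles.
    """
    result = []
    for note in annotations:
        if note.target == target:
            result.append(note.annotation_id)
            result.extend(thread(annotations, note.annotation_id))
    return result


notes = [
    Annotation("a1", "student", "text", "notes/a1.txt", "book.xml#para12"),
    Annotation("a2", "teacher", "audio", "notes/a2.mp3", "a1"),
    Annotation("a3", "teacher", "text", "notes/a3.txt", "answer:q7"),
]
print(thread(notes, "book.xml#para12"))
print(thread(notes, "answer:q7"))
```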
[11:30am]Boris_G:[@Marisa or Boris to move annotations items to combine with existing bookmarks item]
[11:30am]Boris_G:Let's rename bookmark feature to 'annotations'
[11:30am]Boris_G:Annotations may become a product -- eg a professor's notes on Hamlet
[11:30am]Boris_G:commercial annotations on a public domain text
[11:31am]Boris_G:Interest in this for EPub and DAISY
[11:31am]Boris_G:George has 400 questions...
[11:32am]Boris_G:Think about one Question.  What does this look like?
[11:33am]Boris_G:There may be a math equation - up to the player how to render some math
[11:33am]Boris_G:Now an authoring issue - could mark up detailed navigation
[11:36am]Boris_G:Math rendering should be the same in context of a test question or in general reading
[11:37am]Boris_G:Handling of math may be a differentiating feature of players
[11:37am]Boris_G:Same issue with tables - special table-reading mode is helpful, implemented by some players
[11:41am]Boris_G:For math formula navigation, may be better to leave it to user agents to go into text document and generate synthetic speech of any desired component
[11:43am]Boris_G:Even in reading a regular text sentence - you may want to pick it apart and look at sub-pieces. 
[11:44am]Boris_G:Current user agents just play the SMIL - don't normally go into the text document to allow this sort of richer reading experience.
[11:45am]Boris_G:How do we encourage development of options?
[11:45am]Boris_G:User agent developers may be hesitant to trust SMIL-refs, since they are not there in older DAISY books
[11:46am]Boris_G:Q about forms.
[11:46am]Boris_G:Looking at standard set of widgets, plus plugins for things like math
[11:48am]Boris_G:User agent is free to use whatever widgets it has to render the form controls
[11:50am]Boris_G:Some, but not all user agents will use a browser component to render text part of DAISY book.  This would come with its own form controls.
[11:50am]Boris_G:But this is user agent developers' choice
[11:51am]Boris_G:Q: concern about security question
[11:52am]Boris_G:Could submission mechanism be abused to, eg, track readers of a non-test book?
[11:54am]Boris_G:Encourage user-agent developers to get explicit user approval before posting data
[11:54am]kcreasy:If I wanted to spy I probably wouldn't use the formal submission model.
[11:54am]Boris_G:Risk has been there previously - eg, externally-referenced images
[11:56am]Boris_G:Servers should be identified with certificates - so you know where you are sending your info
[11:58am]Boris_G:Q: some workbooks will have an answer key
[11:58am]Boris_G:Would be convenient to connect answers with questions
[11:59am]Boris_G:In Braille production, may want to move answers so that they are in the same volume with questions
[11:59am]Boris_G:Maybe similar to note / noteref
[12:01pm]Boris_G:Q: do we want to specify a format for submissions, so that they would be compatible between players
[12:02pm]Boris_G:Similar to bookmarks schema, but not necessarily the same
[12:04pm]Boris_G:Want the ability to share annotations with others; at your own choice
[12:05pm]Boris_G:Answers to questions - for something like a test, the assumption is these are shared, but user should still have knowledge/control over where information is going
[12:05pm]Boris_G:Could annotations and question answers have the same schema?  may be too early to tell
[12:06pm]Boris_G:[After moving annotations items out, this section of the goals document will be renamed "Forms"]
[12:08pm]Boris_G:After lunch - will check in with requirements database to see if we've forgotten anything
[12:09pm]Boris_G:Then start to think about strawman creation
[12:09pm] kcreasy left the chat room. (leaving)
[12:10pm]Boris_G:Big question about spec - do we need an additional component in fileset - general data about streams of information and how they relate - to keep some complexity out of SMIL.
[12:10pm]Boris_G:[breaking for lunch for 1 hour]