Technical Requirements and Guidelines for Videos

Videos are an essential component of your work that can provide viewers with additional information in rich detail, can be used as publicity material to give it broader exposure, or can serve as an alternative to a live presentation. They are often featured both on the ACM SIGCHI YouTube channel and on the ACM Digital Library. This page outlines a set of technical requirements for all video submissions. It also provides guidelines and step-by-step instructions for creating videos that meet these requirements.

  1. Video Categories
  2. Key Requirements
  3. Guidelines
    1. Video Recording
    2. File Size Limits
    3. Closed Captions
  4. Frequently Asked Questions (FAQs)
    1. Can I use third-party material and what other copyright issues should I consider?
    2. What accessibility considerations should I pay attention to when recording my video?
    3. How do I record a picture-in-picture video?
    4. How do I re-encode my video to reduce its file size?
    5. How do I create time-stamped caption files in a suitable format?
  5. Additional Resources

Video Categories

  1. Video Figures: These are submitted as supplementary material with your submission, and are used to illustrate key aspects of the work. They are typically up to 5 minutes in duration, and are archived in the ACM Digital Library.
  2. Video Previews: These are typically a 30-second summary or “elevator pitch” of your publication. Only accepted publications are invited to submit a video preview. These are published on YouTube to help attendees plan their conference. Video Previews also appear in the ACM Digital Library next to your publication’s abstract. Selected Video Previews are featured in the conference Teaser Video.
  3. Video Presentations: These are a pre-recorded version of your conference talk, used for virtual or hybrid conference formats. Only accepted publications are invited to submit a video presentation. The duration varies by conference and technical track (refer to your PCS submission page for details). Video Presentations are typically played back during conference sessions, and also appear on YouTube and later under Supplementary Material in the ACM Digital Library.

Key Requirements

  • Videos should be recorded and submitted in a typical video format (MP4, M4V, MOV, or AVI) and resolution.
  • Videos should strictly follow the duration limit that is specified. Videos that exceed the limit might automatically be trimmed.
  • Speech should be clearly audible and text should be readable.
  • Every video should be accompanied by a timestamped closed-caption file in a suitable format (SRT, SBV, or VTT). If the video does not contain any audio, a caption file with an informative description (e.g. [This video contains no audio]) should be included. Captions should not be burnt into the video (i.e. no open captions).
  • Additional requirements (e.g. duration, file size) might be specified by individual venues, and can be found on the PCS submission page or conference website.
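
As a rough illustration of the format requirements above, the following Python sketch checks file names against the listed video and caption formats. The function name `check_submission` and the example file names are hypothetical; PCS performs its own checks, and individual venues may add requirements that this sketch does not cover.

```python
from pathlib import Path

# Formats taken from the requirements listed above.
VIDEO_EXTENSIONS = {".mp4", ".m4v", ".mov", ".avi"}
CAPTION_EXTENSIONS = {".srt", ".sbv", ".vtt"}


def check_submission(video_name, caption_name):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if Path(video_name).suffix.lower() not in VIDEO_EXTENSIONS:
        problems.append(f"{video_name}: not one of MP4, M4V, MOV, AVI")
    if Path(caption_name).suffix.lower() not in CAPTION_EXTENSIONS:
        problems.append(f"{caption_name}: not one of SRT, SBV, VTT")
    return problems


# Hypothetical file names, for illustration only.
print(check_submission("talk.mp4", "talk.srt"))
print(check_submission("talk.mkv", "talk.txt"))
```

An empty list means only that the file extensions look right; it says nothing about duration, audio quality, or caption accuracy.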

Guidelines

Video Recording

  • Opening title: It is recommended to include the title at the start of the video.
  • Author information: If a video needs to be anonymised during the submission stage, it should not contain any author or affiliation information (logos, footers, credits, etc.). Any author’s face appearing in the video must be made unidentifiable (by adjusting the camera’s viewpoint or applying video filters).
  • Third-party material and copyright: It is very important that you have the rights to use all the material that is contained in your submission, including music, video, images, etc. (See FAQs below for more information and tips).
  • Accessibility of videos: Pay attention to details such as flashing lights, unsteady camera work, and loud sounds to increase the accessibility of your video (see FAQs below for more details).
  • Recording videos with slides or animations only: Typical presentation software (PowerPoint, Keynote) allows recording presentations with microphone audio.
  • Recording videos with picture-in-picture speaker view: You can either separately capture the speaker’s video and insert it using video editing software, or you can use Zoom, Loom, or similar software to directly record a picture-in-picture video (See FAQs below for more information and tips).
  • Sanity check: Please ensure that content is appropriate in terms of rights and taste, does not contain inappropriate language, viewpoints or imagery and is unlikely to cause offence to any individuals or groups either present at the conference or beyond.

File Size Limits

Depending on the video category, there may be different file size limits for video submissions. For example, the maximum file size for a Video Presentation is 300 MB, and PCS will not allow submissions larger than this limit. When you record your video, depending on the software and format used, the file size might exceed the specified limit. You can re-encode your video using freely available software to reduce it and ensure that it meets requirements.

  • HandBrake is freely available for Windows, macOS, and Linux. Use the “General → Fast 1080p30” option for re-encoding your video.
  • Alternatives such as Freemake Video Converter and Miro Video Converter can be used as well.
  • FFmpeg is a well-known transcoding solution that can be used as a command-line alternative.

See FAQs below for full instructions.
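
To see how a size limit translates into an encoding setting, the underlying arithmetic is: available bitrate is roughly the file size in bits divided by the duration in seconds. The sketch below applies this to the 300 MB limit mentioned above. The 10-minute duration, the 128 kbit/s audio allowance, the 5% safety margin, and the reading of 1 MB as 1,000,000 bytes are all assumptions made for illustration, not values from this guide.

```python
def video_bitrate_kbps(limit_mb, duration_s, audio_kbps=128, margin_percent=5):
    """Estimate the video bitrate (kbit/s) that keeps a file under limit_mb."""
    limit_bits = limit_mb * 1_000_000 * 8   # assumes 1 MB = 1,000,000 bytes
    total_bps = limit_bits // duration_s    # budget shared by video and audio
    usable_bps = total_bps * (100 - margin_percent) // 100  # leave headroom
    return usable_bps // 1000 - audio_kbps  # what remains for the video stream


# Hypothetical 10-minute talk against the 300 MB Video Presentation limit.
print(video_bitrate_kbps(300, 600))  # 3672
```

If the result is far below the bitrate of your recording, re-encoding is likely needed; if it is comfortably above, the file probably fits already.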

* SIGCHI does not endorse, and is not responsible for, the use of any of the software mentioned in this guide.

Closed Captions

Every video should be accompanied by a closed-caption file with timing in an appropriate file format (SRT, SBV, or VTT). Captions are invaluable for many of us, for different reasons. Some people cannot hear the audio at all, and the captions convey what is being said. Others use captions to supplement and reinforce what they hear. If a viewer has difficulty understanding a speaker, or didn’t catch an important name, captions help. If you are in a noisy environment, or need to be listening for a baby’s cry, captions help. For our international audience, captions improve language access.
Here are two examples of closed captioning done well: Example 1, Example 2.

There are two common approaches for creating closed captions:

  • Convert a transcript to a caption file: This is the method we primarily recommend for accurate captions. Here, you first need to create a line-by-line transcript of your video in plain text. Include additional information (e.g. background music) using brackets (“[Background music playing]”). You can then use YouTube to automatically add timestamps and generate the final caption file. 
  • Automated captions: If you cannot prepare a transcript, you can use automated captioning tools (e.g. YouTube, otter.ai) to generate caption files. However, since these tools rely on AI and speech recognition to generate the text, the output can contain errors, especially in technical terms and names. It is important to check the generated caption file using any text editing tool and manually correct any mistakes.

See the FAQ section below for full instructions.
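
For readers unfamiliar with the SRT format, the following Python sketch shows what a timestamped caption file looks like: each cue has a running number, a timing line of the form `HH:MM:SS,mmm --> HH:MM:SS,mmm`, the caption text, and a blank line. The function names and the example cues are hypothetical; in practice the timings come from YouTube or another captioning tool rather than being typed by hand.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as HH:MM:SS,mmm (the SRT timestamp layout)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(cues):
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    return "\n".join(blocks)  # a blank line separates consecutive cues


# Hypothetical cues; real timings come from your own video.
cues = [
    (0.0, 2.5, "[Background music playing]"),
    (2.5, 6.0, "Hello, and welcome to our talk."),
]
print(to_srt(cues))
```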

Frequently Asked Questions (FAQs)

Can I use third-party material and what other copyright issues should I consider?

Authors retain copyright to their videos, but ACM requires that you sign an agreement allowing ACM to distribute the material. This is typically a part of the “ACM Rights Review” form you sign when finalising your publication.

It is very important that you have the rights to use all the material that is contained in your submission, including music, video, images, etc. Attaining permissions to use video, audio, or pictures of identifiable people or proprietary content rests with the author, not the ACM or SIGCHI.

If you need to use copyright-protected work, you are required to review and comply with ACM’s Copyright and Permission Policy and ACM’s Requirements about 3rd party material. Remember that even if you declare that you have used third-party material, if your video violates any fair use guidelines, it may be partly or entirely blocked, or permanently removed.

Video Previews and Video Presentations are often uploaded to YouTube. YouTube will show advertisements within videos that contain monetized audio content, regardless of what copyright is associated with that content. To ensure that your video can be shown without advertisements, we recommend that you upload your video to YouTube before submission, set it to “private” or “unlisted”, and check whether any of its content is flagged for monetization.

You are encouraged to use Creative Commons content, for example music available at ccMixter or Newgrounds. YouTube’s copyright education website provides useful information on reusing 3rd party material. 

Note: You will be asked to confirm that you agree with these policies on the final submission form.

What accessibility considerations should I pay attention to when recording my video?

SIGCHI members represent a diverse group and there may be attendees who are blind or have other vision impairments (e.g., low vision and impaired color vision). Many are not native speakers of English. Some attendees may rely on lip reading, may have difficulty reading words on the slides, or may be sensitive to animations or flashing lights. Therefore, it is important that you carefully design all aspects of your video so that every member is able to access the information you are sharing.

You can use the tips for creating an accessible presentation provided in this 5-minute video. Remember that some people will not be able to see your visuals, so the presentation should be understandable from the script alone – if visuals are important you should verbally describe them.

Please avoid using effects in your video that could trigger an adverse reaction. For example, flashing lights can induce seizures for people with photosensitive epilepsy. Avoid using animations (simple appear/disappear is ok), unsteady camera work, flashing strobe lights, loud sounds, or repetitive alarms. If you include components such as police car lights and sirens, consider warning viewers at the start of the video or right before the content so they can look away or mute their computers. The Trace Center offers an analysis tool to help authors assess whether their video is safe for people with photosensitive epilepsy (https://trace.umd.edu/peat/).

Below are additional recommendations about the three components of video content: script, visuals and audio.

Script:

  1. Include all important information – don’t assume everyone can see the visuals. 
  2. Describe images and charts
  3. Avoid using slang and colloquialisms – use simple direct language.
  4. Avoid pointing and saying “as you can see …” or “… here” without giving additional information verbally
  5. If your visuals need more description than can be included in the script, consider providing an audio described version of the video, or give a link to a written description.

Visuals:

  1. Remember that viewers may have captions showing on the bottom part of the screen and avoid using that area for important information.
  2. Use a color scheme with good contrast
  3. Avoid small text 
  4. Use more than just color to communicate information. 
  5. Avoid animations and visual effects that could trigger an adverse reaction. For example, flashing lights can induce seizures for people with photosensitive epilepsy. Avoid unsteady camera work and flashing strobe lights. If you include such components, warn viewers before this content so they can look away. 
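
One way to make “good contrast” measurable is the contrast ratio defined in the Web Content Accessibility Guidelines (WCAG) 2.x, which ranges from 1:1 (identical colors) to 21:1 (black on white). WCAG recommends at least 4.5:1 for normal-size text; this guide itself does not set a numeric threshold, so treat that figure as an outside reference point. The sketch below is a minimal implementation of that formula, with hypothetical slide colors.

```python
def _linearize(channel_8bit):
    """Convert one 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) color with 0-255 channels."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(rgb_a, rgb_b):
    """Contrast ratio between two colors, from 1.0 (identical) to 21.0."""
    lighter, darker = sorted(
        (relative_luminance(rgb_a), relative_luminance(rgb_b)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


# Hypothetical slide colors: dark grey text on a white background.
print(round(contrast_ratio((51, 51, 51), (255, 255, 255)), 2))
```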

Audio:

  1. Make sure speech is audible and not muffled by loud background music.
  2. Avoid loud sounds, or repetitive alarms that could trigger an adverse reaction.
  3. If you include components such as police car sirens, warn viewers before this content so they can mute their computers.
  4. If you want to add background audio (with cleared rights), many video editors support this. Searching online for “audio ducking” together with the name of your editor is a good starting point.

How do I record a picture-in-picture video?

If you already know how to use professional video editing tools (e.g. Adobe Premiere, Final Cut Pro X), you probably don’t need further instructions. Remember that a picture-in-picture video is not the only way you can include both slides (or other material) and the speaker view; you could get creative and try using a green screen to blend the content and the speaker view.

However, there are also very simple ways to go about recording a video that has both content (slides, animations, etc.) and the speaker’s video. 

Using Zoom: This is a fast and user-friendly way of creating your video. 

  • Start a new meeting
  • Include your slides using screen-sharing 
  • Enable the video to add the picture-in-picture view
  • Click on “Record” to create a new video recording
  • Stop the recording and end the meeting. The recording will be encoded to MP4 and saved to a Zoom folder.
  • The final recording might need to be re-encoded to reduce the file size.

Using Loom: This is another good alternative. It can run inside a browser or as a desktop application (macOS or Windows). It is a straightforward tool with plenty of onboarding. It uploads your video to its cloud storage when you’re done, but you can easily download the video and delete it from their web storage if you like.

Using OBS Studio: This is a more advanced option, but also allows for more customizability. For example, it allows you to specify the position and size of the speaker view, or create a side-by-side video instead. You can find several guides and tutorials for recording videos with OBS Studio online.

How do I re-encode my video to reduce its file size?

HandBrake is a cross-platform tool that can be used to quickly re-encode any video to a standard MP4 format.

  1. Add your video to HandBrake: Use the “Open Source” button or menu item, or drag-and-drop the video file into the main window.
  2. Select the preset “General → Fast 1080p30”.
  3. Press the “Start” button. 
  4. The re-encoded video will be saved as an MP4 file in the specified destination directory.

As a command-line alternative, you can use FFmpeg to re-encode your video. Note that the input flag is a lowercase “-i”. This simple command should do the trick:

ffmpeg -i <original_path_to_video> <final_path_without_extension>.mp4
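
If the default settings do not shrink the file enough, FFmpeg can be given explicit encoder options. The Python sketch below builds such a command; it is one possible configuration, not a setting prescribed by this guide. The choice of H.264 via `libx264`, the CRF value of 23, the `medium` preset, the 128k audio bitrate, and the file names are all assumptions, and `libx264` must be available in your FFmpeg build.

```python
import subprocess


def build_ffmpeg_command(source, destination, crf=23):
    """Return an FFmpeg argument list that re-encodes to H.264 video and AAC audio."""
    return [
        "ffmpeg",
        "-i", source,         # input file (note the lowercase -i)
        "-c:v", "libx264",    # H.264 video (needs an FFmpeg build with libx264)
        "-crf", str(crf),     # quality target: a higher value gives a smaller file
        "-preset", "medium",  # trade-off between encoding speed and compression
        "-c:a", "aac",        # AAC audio
        "-b:a", "128k",       # audio bitrate
        destination,
    ]


def reencode(source, destination, crf=23):
    """Run FFmpeg; raises CalledProcessError if FFmpeg reports a failure."""
    subprocess.run(build_ffmpeg_command(source, destination, crf), check=True)


# Hypothetical file names; printing the command lets you inspect it before running.
print(" ".join(build_ffmpeg_command("talk_original.mov", "talk.mp4")))
```

Raising the CRF value (for example to 28) reduces the file size at the cost of visual quality, so check that text in the re-encoded video is still readable.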

How do I create time-stamped caption files in a suitable format?

You can use YouTube Studio to create a time-stamped caption file in an appropriate format (SRT, SBV, or VTT). Alternatively, you can use other tools such as otter.ai that provide captioning features.

Using YouTube Studio:

  1. Log in to YouTube Studio
  2. Upload your video using the “Create” button, and set its visibility to “private”.
  3. Generate your caption file:
    1. Using a transcript:
      1. Upload the text file containing a plain text transcript.
      2. YouTube will automatically add timing information.
      3. Select your video to edit it.
      4. Go to the “Subtitles” view.
      5. Download the subtitle in SRT format using the options next to “Duplicate and Edit”
    2. Using automated captioning:
      1. After you finish uploading your video, YouTube will automatically generate subtitles for it (ensure that the language is set to “English”).
      2. Once these are ready, follow steps 3 to 5 as above.
      3. After downloading the generated file, use any text editor to manually fix any errors or mistakes. Pay attention to technical terms, abbreviations, and proper nouns.
      4. Save the final SRT caption file and upload this with your video.
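
After editing a caption file by hand, it is easy to leave a timing line malformed. The Python sketch below is a minimal consistency check for SRT text: it verifies that each cue has an index line, a well-formed timing line, and text, that each cue ends after it starts, and that cues do not overlap. The function name `check_srt` and the sample captions are hypothetical, and the check does not replace reading the captions against the video.

```python
import re

TIMING = re.compile(
    r"^(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})$"
)


def _to_ms(h, m, s, ms):
    """Convert timestamp fields (as strings) to milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def check_srt(text):
    """Return a list of problems found in SRT text (empty if none)."""
    problems = []
    previous_end = 0
    blocks = [b for b in re.split(r"\n\s*\n", text.strip()) if b.strip()]
    for number, block in enumerate(blocks, start=1):
        lines = block.splitlines()
        if len(lines) < 3:
            problems.append(f"cue {number}: expected index, timing and text lines")
            continue
        match = TIMING.match(lines[1].strip())
        if not match:
            problems.append(f"cue {number}: malformed timing line")
            continue
        parts = match.groups()
        start, end = _to_ms(*parts[:4]), _to_ms(*parts[4:])
        if end <= start:
            problems.append(f"cue {number}: end is not after start")
        if start < previous_end:
            problems.append(f"cue {number}: overlaps the previous cue")
        previous_end = max(previous_end, end)
    return problems


# Hypothetical caption text with two deliberate timing mistakes.
sample = (
    "1\n00:00:03,000 --> 00:00:02,000\nWelcome to our talk.\n\n"
    "2\n00:00:01,000 --> 00:00:04,000\nWe present a new toolkit.\n"
)
for problem in check_srt(sample):
    print(problem)
```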

Using Otter.ai:

This is a handy tool for automatically creating transcripts, and typically has good accuracy. The free version allows you to only export to text (.TXT) files, while the paid versions allow exporting time-coded captions in other formats (e.g. .SRT).

  1. Create a free otter.ai account and log in.
  2. Upload your video using the “Import” button.
  3. Once the video is uploaded, the captions will be automatically generated.
  4. Select the “My Conversations” tab, and then select your video from the list.
  5. The transcript will be displayed. Check the text for errors, and make any required corrections by editing directly. Pay attention to technical terms, abbreviations, and proper nouns.
  6. Click on the ellipsis (⋯) next to the “Edit” button, and select the “Export text” option.
  7. Free version:
    1. Choose TXT as the “Export Format” and turn off all the options shown (speaker names, timestamps, etc.).
    2. Download the transcript file.
    3. You can then add timestamps and create a caption file (SRT, SBV, or VTT) in YouTube Studio by following the steps above under “Using a transcript”.
  8. Pro version: Choose SRT as the “Export format” and click “Continue” to download the SRT caption file. Upload this file along with your video.

If you have any suggestions or questions regarding these requirements and guidelines, feel free to drop the SIGCHI Video Operations team an email: sigchi-video@acm.org

Additional Resources

The SIGCHI Guide to an Accessible Submission provides detailed instructions on authoring, submitting, and finalising accessible documents.