Supported File Formats

August supports a range of file formats for document analysis, review, transcription, and drafting workflows. The right format depends on the feature you're using and the work you need to accomplish.

Document Formats

Document formats are the primary inputs for Assistant, Tabular Review, and Clause Drafting workflows.

Word Documents (.docx)

Microsoft Word documents are the most common format for legal drafting and editing work in August.

  • Best for: Drafts, agreements, and any documents where you need to produce tracked changes or editable outputs.

  • Features: Full text extraction, clause analysis, redlining, and tracked-change exports.

  • Use with: Assistant for analysis and drafting; Tabular Review for comparison; Clause Drafting in Word for inline edits.

PDF (.pdf)

Portable Document Format files are widely used for executed agreements, scanned documents, and finalized materials.

  • Best for: Finalized agreements, executed contracts, and documents where preservation of original formatting matters.

  • Features: Text extraction, OCR processing for scanned documents (including image-only PDFs), clause analysis, and citation-ready outputs.

  • Use with: Assistant for analysis and Q&A; Tabular Review for multi-document comparison.

For best results with scanned PDFs, ensure the document is clearly legible. August applies OCR automatically to extract text from image-based PDFs.

Password-Protected PDFs

August handles password-protected PDFs differently depending on the type of protection:

  • User-password-protected PDFs require a password to open. August will prompt you to enter the password during upload so the document can be processed.

  • Owner-password-protected PDFs (also called permissions-restricted PDFs) restrict editing, printing, or copying but open without a password in most viewers. August uploads these files without prompting for a password.

If a PDF opens normally in Acrobat, Preview, or Chrome without asking for a password, it should upload to August without requiring one—even if the file has usage restrictions applied.

Plain Text (.txt)

Plain text files provide unformatted content for analysis and processing.

  • Best for: Raw text content, code snippets, notes, and any content where formatting is not required.

  • Features: Direct text processing without extraction overhead, suitable for quick analysis.

  • Use with: Assistant for analysis and Q&A.

Markdown and HTML (.md, .html)

Markdown and HTML files are structured text formats commonly used for documentation and web content.

  • Best for: Technical documentation, structured notes, and content with formatting markup.

  • Features: Text extraction with structure preservation where applicable.

  • Use with: Assistant for analysis and content review.

Spreadsheet Formats

Spreadsheet formats support analysis of tabular data, financial schedules, and structured information.

Excel (.xlsx)

Microsoft Excel spreadsheets are the primary spreadsheet format for financial data, schedules, and tabular analysis.

  • Best for: Financial schedules, cap tables, deal matrices, and structured data sets requiring tabular analysis.

  • Features: Cell-level extraction, formula awareness, and structured data analysis.

  • Use with: Assistant for Q&A on spreadsheet content; Tabular Review for structured extraction.

CSV (.csv)

Comma-separated values files provide simple tabular data without formatting or formulas.

  • Best for: Data exports, simple tabular data, and unformatted spreadsheet content.

  • Features: Structured text extraction for tabular analysis.

  • Use with: Assistant for data analysis and Q&A.

Presentation Formats

Presentation formats support analysis of slide decks and visual materials.

PowerPoint (.pptx)

Microsoft PowerPoint files contain slide presentations with text, images, and structured content.

  • Best for: Pitch decks, training materials, and presentations requiring content extraction or summarization.

  • Features: Slide-by-slide text extraction and content analysis.

  • Use with: Assistant for content review, summarization, and Q&A.

Image Formats

Image formats are processed through OCR (optical character recognition) to extract text for analysis.

TIFF (.tif, .tiff)

Tagged Image File Format is common in legal production sets and scanned document collections.

  • Best for: Document productions, scanned legal filings, and archival materials received as image files.

  • Features: OCR processing for text extraction, analysis alongside other document types.

  • Use with: Assistant for Q&A on scanned productions; Tabular Review for extraction across image-based document sets.

Other Image Formats

August also applies OCR to other common image file types used in document productions.

  • Supported formats: JPEG (.jpg, .jpeg), PNG (.png), and other standard image formats.

  • Best for: Single-page document scans, exhibits, and embedded images that need text extraction.

  • Features: OCR processing for text extraction, allowing analysis within Assistant and Tabular Review.

Audio Formats

Audio files are processed through transcription for use in Live Assist.

Supported Audio Types

August supports common audio formats for real-time transcription and post-conversation analysis.

  • Supported formats: MP3 (.mp3), WAV (.wav), M4A (.m4a)

  • Best for: Depositions, client calls, witness interviews, and settlement negotiations where you need a searchable transcript.

  • Features: Real-time transcription with speaker labels, cross-referencing against uploaded documents, and flagging of contradictions with citations.

  • Use with: Live Assist for real-time transcription during live conversations.

Before recording any conversation, confirm you have proper consent under applicable recording and consent laws in your jurisdiction. Compliance with local recording laws is your responsibility.

Email Formats

Email file formats allow you to analyze email productions and communications within August workflows.

EML (.eml)

EML files are standard email message files that preserve message content, headers, and attachments.

  • Best for: Email productions in discovery, internal investigations, and correspondence analysis.

  • Features: Extraction of message body, headers, and attachments; analysis within Assistant; cross-referencing in Live Assist.

  • Use with: Assistant for email analysis and fact extraction; Live Assist for cross-referencing email productions against live testimony; Tabular Review for structured extraction across email sets.

MSG (.msg)

MSG files are Microsoft Outlook message files containing email content, headers, and attachments.

  • Best for: Outlook email exports, discovery productions in MSG format, and correspondence analysis from Microsoft Exchange environments.

  • Features: Extraction of message body, headers, and attachments; analysis within Assistant and Tabular Review.

  • Use with: Assistant for email analysis and fact extraction; Tabular Review for structured extraction across email sets.

Format Summary by Feature

Each August feature supports its own set of formats, depending on the workflow:

| Feature | Supported Formats | Workflow |
|---|---|---|
| Assistant | DOCX, PDF, XLSX, CSV, PPTX, TXT, MD, HTML, TIFF, images (JPEG, PNG), EML, MSG | Analysis, Q&A, drafting, research |
| Tabular Review | DOCX, PDF, XLSX, CSV, TIFF, images, EML, MSG | Structured extraction and comparison |
| Live Assist | Audio (MP3, WAV, M4A), plus DOCX, PDF, TIFF, EML, MSG for cross-referencing | Real-time transcription and flagging |
| Clause Drafting | DOCX | Inline drafting and redlining in Word |
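
If you sort files into workflows with a script before uploading, the feature summary above can be expressed as a simple lookup. The following Python sketch is illustrative only and is not part of August. The names FEATURE_FORMATS and features_for are hypothetical, and "images" is assumed to mean only the JPEG and PNG extensions named in this article.

```python
import os

# Assumption: "images" in the summary covers only the JPEG and PNG
# extensions named in this article.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}

# Format support per feature, transcribed from the feature summary.
FEATURE_FORMATS = {
    "Assistant": {
        ".docx", ".pdf", ".xlsx", ".csv", ".pptx", ".txt", ".md", ".html",
        ".tif", ".tiff", ".eml", ".msg",
    } | IMAGE_EXTENSIONS,
    "Tabular Review": {
        ".docx", ".pdf", ".xlsx", ".csv", ".tif", ".tiff", ".eml", ".msg",
    } | IMAGE_EXTENSIONS,
    "Live Assist": {
        ".mp3", ".wav", ".m4a",                             # audio input
        ".docx", ".pdf", ".tif", ".tiff", ".eml", ".msg",   # cross-referencing
    },
    "Clause Drafting": {".docx"},
}


def features_for(filename):
    """Return the features whose format list includes this file's extension."""
    extension = os.path.splitext(filename)[1].lower()
    return [name for name, formats in FEATURE_FORMATS.items()
            if extension in formats]
```

For example, features_for("deck.pptx") returns only Assistant, while a .docx file matches all four features.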

Choosing the Right Format

When preparing documents for upload, follow these guidelines:

For drafting and revision work

Use DOCX when you need to produce editable outputs or tracked changes. This format preserves formatting for downstream editing and allows August to generate redlines directly.

For analysis and review

Both PDF and DOCX work well for analysis. Use PDF for executed documents and DOCX for drafts you may want to edit later.

For scanned documents and productions

TIFF, JPEG, PNG, and other image formats are automatically processed through OCR. Ensure documents are legible for accurate extraction.

For spreadsheet and data analysis

Use XLSX for formatted spreadsheets with formulas and structure. Use CSV for simple tabular data exports without formatting.

For email analysis

Upload email messages in EML or MSG format. Both preserve message structure and attachments for analysis.

For live conversations

Upload reference documents (PDF, DOCX, TIFF, EML, MSG) before using Live Assist so August can cross-reference statements in real time.

File Naming and Extension Handling

When uploading documents to August, include the correct file extension in the filename. August uses extensions to determine how to process, open, and download files.

Required Extensions for Upload

Uploads through Genius Mode and ReAct workflows require filenames with recognizable extensions. If a filename lacks a supported extension or uses an ambiguous format, the upload will be rejected.

Supported extensions include:

  • Documents: .docx, .doc, .pdf, .txt, .md, .html, .json

  • Spreadsheets: .xlsx, .xls, .csv

  • Presentations: .pptx, .ppt

  • Images: .jpg, .jpeg, .png, .gif, .bmp, .webp, .tif, .tiff

  • Audio: .mp3, .wav, .m4a

  • Email: .msg, .eml
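
To catch rejected uploads before they happen, you can check filenames locally against the list above. This is a minimal Python sketch, not August's actual validation logic. The function name upload_category and the rule that an extension is a final dot followed only by letters or digits are assumptions made for illustration.

```python
import re

# Extensions copied from the supported list above, grouped by category.
SUPPORTED_EXTENSIONS = {
    "Documents": {".docx", ".doc", ".pdf", ".txt", ".md", ".html", ".json"},
    "Spreadsheets": {".xlsx", ".xls", ".csv"},
    "Presentations": {".pptx", ".ppt"},
    "Images": {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp", ".tif", ".tiff"},
    "Audio": {".mp3", ".wav", ".m4a"},
    "Email": {".msg", ".eml"},
}

# Assumed rule: an extension is a final dot followed only by letters or
# digits, so a trailing date such as "[2026.05.11]" is not an extension.
_EXTENSION_RE = re.compile(r"\.([A-Za-z0-9]+)$")


def upload_category(filename):
    """Return the category of the filename's extension, or None if unsupported."""
    match = _EXTENSION_RE.search(filename.strip())
    if match is None:
        return None
    extension = "." + match.group(1).lower()
    for category, extensions in SUPPORTED_EXTENSIONS.items():
        if extension in extensions:
            return category
    return None
```

A filename such as Lease.PDF maps to Documents, while Agreement or Minutes [2026.05.11] returns None and should be renamed before upload.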

How August Handles Missing or Incorrect Extensions

For files already uploaded to your workspace, August uses content-type detection to determine how to open and download files—even when the filename extension is missing or incorrect.

  • Office documents (.docx, .xlsx, .pptx) open in the correct viewer based on their actual content, not just the filename.

  • Downloads from chat, the file viewer, and work product views use the correct extension inferred from the file's content type.

  • Excel files are treated as download-only in the file viewer.

  • Date patterns in filenames (such as [2026.05.11]) are recognized as dates, not file extensions.

If you encounter a file that won't open correctly, try downloading it instead. The downloaded filename will have the correct extension based on the actual file type.

Example: Filename Normalization

If you upload a Word document named Agreement (no extension), August may normalize it to Agreement.docx based on the file's content. When you download the file later, it will have the correct extension.
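
The idea behind content-based normalization can be sketched in Python using file signatures. This is an assumption about one common approach, not a description of how August detects content types. The names sniff_extension and normalize_filename are hypothetical, and the sketch only appends a missing extension; it does not replace an incorrect one.

```python
import io
import zipfile

# Top-level folder inside an Office Open XML package, mapped to its extension.
_OOXML_FOLDERS = {"word/": ".docx", "xl/": ".xlsx", "ppt/": ".pptx"}


def sniff_extension(data):
    """Guess an extension from the file's bytes; return None when unsure."""
    if data.startswith(b"%PDF-"):
        return ".pdf"
    if data.startswith(b"PK\x03\x04"):  # ZIP container used by DOCX/XLSX/PPTX
        try:
            with zipfile.ZipFile(io.BytesIO(data)) as archive:
                names = archive.namelist()
        except zipfile.BadZipFile:
            return None
        for folder, extension in _OOXML_FOLDERS.items():
            if any(name.startswith(folder) for name in names):
                return extension
    return None


def normalize_filename(filename, data):
    """Append the sniffed extension unless the filename already ends with it."""
    extension = sniff_extension(data)
    if extension is None or filename.lower().endswith(extension):
        return filename
    return filename + extension
```

Given the bytes of a Word document, normalize_filename("Agreement", data) returns Agreement.docx, which mirrors the example above.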
