OVERVIEW
We are a legal AI workflow consultancy building a document intelligence and automation system for a law firm that handles large-volume litigation cases.
The law firm stores all client documents in Microsoft OneDrive. The project requires building an automated pipeline that:
Extracts structured data from approximately one million documents stored in OneDrive using Azure Document Intelligence
Sends the extracted data to Claude (Anthropic's AI) for analysis according to specific instructions provided by the client
Delivers the analyzed results to a legal case management system (Practice Panther) for staff review
This pipeline will replace a manual document review process that currently requires multiple full-time contractors. The goal is a reliable, automated system that processes large document sets consistently and delivers structured, reviewable output to the firm's staff.
This is a multi-phase engagement covering four workflows. We are hiring one contractor to build all four phases so the system is architecturally consistent from start to finish. You will work directly with our workflow consultant, who handles system design and client coordination. Your role is technical implementation.
WHAT YOU WILL BUILD
The core pipeline for Phase 1 works as follows:
Step 1 — Connect to OneDrive
Step 2 — Extract data using Azure Document Intelligence
Process each document through Azure Document Intelligence to extract structured text and field-level data. The document set includes tax returns, bank statements, financial documents, emails, and supporting case files. The system needs to handle PDFs, scanned documents, and image files. Custom model training on the firm's specific document formats will be required.
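As an illustration of the file-type handling described above, here is a minimal sketch that sorts file paths into processing routes before extraction. The route names and the extension table are assumptions made for this example and are not taken from Azure Document Intelligence; the formats a given model accepts should be confirmed against the Azure documentation.

```python
from pathlib import PurePosixPath

# Hypothetical routing table (an assumption for this sketch):
# maps a lowercase file extension to a processing route.
ROUTES = {
    ".pdf": "pdf",
    ".jpg": "image",
    ".jpeg": "image",
    ".png": "image",
    ".tif": "image",
    ".tiff": "image",
    ".bmp": "image",
}


def route_document(path: str) -> str:
    """Return the processing route for a file path, or "other" if unknown."""
    extension = PurePosixPath(path).suffix.lower()
    return ROUTES.get(extension, "other")


def partition(paths):
    """Group paths by route so each group can be handled separately."""
    buckets = {"pdf": [], "image": [], "other": []}
    for path in paths:
        buckets[route_document(path)].append(path)
    return buckets
```

Files in the "other" bucket (for example, email formats) would need a separate conversion or handling path, which this sketch does not cover.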
Step 3 — Reduce and filter the document set
Remove duplicate documents and filter out irrelevant files before analysis. The goal is to reduce the full document set from approximately one million documents down to the most relevant subset — typically 20,000 to 50,000 documents — before sending anything to AI. This step is critical for cost efficiency and analysis quality.
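One possible shape for this reduction step, sketched under stated assumptions: exact duplicates are detected by hashing whitespace-normalized, lowercased text, and relevance is approximated by a simple keyword match. The document shape (a dict with "id" and "text") and the keyword approach are illustrative choices for this example, not the required design. Near-duplicate detection and real relevance scoring would need more than this.

```python
import hashlib
import re


def fingerprint(text: str) -> str:
    """Hash of the text after collapsing whitespace and lowercasing."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def dedupe_and_filter(docs, keywords):
    """Drop exact duplicates, then keep documents mentioning any keyword.

    docs: iterable of dicts with "id" and "text" keys (hypothetical shape).
    keywords: phrases that mark a document as relevant (assumption).
    """
    seen = set()
    kept = []
    lowered = [keyword.lower() for keyword in keywords]
    for doc in docs:
        fp = fingerprint(doc["text"])
        if fp in seen:
            continue  # an identical document was already processed
        seen.add(fp)
        body = doc["text"].lower()
        if any(keyword in body for keyword in lowered):
            kept.append(doc)
    return kept
```

Because the first copy of each duplicate group is the one retained, the order in which documents are read determines which copy survives.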
Step 4 — Index documents for retrieval
Step 5 — Send extracted data to Claude for analysis
Pass the filtered and indexed document data to Claude (Anthropic's AI API) along with the client's specific analysis instructions. Claude will analyze the data according to those instructions — for example, identifying patterns, flagging anomalies, comparing documents, or summarizing findings across a set of records.
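To make the "not everything at once" idea concrete, here is a minimal sketch that groups filtered documents into batches under a character budget and assembles one prompt per batch. The character budget, the document shape, and the prompt layout are assumptions for illustration only. The actual request to the Claude API is deliberately left out, since the model choice and request parameters would be decided during the build.

```python
def batch_documents(docs, max_chars):
    """Group documents into batches whose combined text fits max_chars.

    A single document longer than max_chars is placed in its own batch.
    docs: list of dicts with "id" and "text" keys (hypothetical shape).
    """
    batches, current, size = [], [], 0
    for doc in docs:
        length = len(doc["text"])
        if current and size + length > max_chars:
            batches.append(current)
            current, size = [], 0
        current.append(doc)
        size += length
    if current:
        batches.append(current)
    return batches


def build_prompt(instructions, batch):
    """Combine the client's instructions with one batch of document text."""
    parts = [instructions, ""]
    for doc in batch:
        parts.append(f"[Document {doc['id']}]")
        parts.append(doc["text"])
    return "\n".join(parts)
```

Each prompt string would then be sent as one analysis request. Tagging every document with its ID lets the results be traced back to source files during staff review.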
Step 6 — Deliver results to Practice Panther for staff review
PHASE BREAKDOWN
This engagement covers four phases. All phases will be built by one contractor. Phase 1 is the priority and where work begins.
Phase 1 (Priority)
What Gets Built: OneDrive connection, document extraction via Azure, deduplication, OpenSearch indexing, Claude AI analysis, Practice Panther delivery
Key Tools: Azure Document Intelligence, Microsoft Graph API, OpenSearch, Claude API, N8N, Practice Panther API
Phase 2
What Gets Built: Client document upload portal with mandatory categorization, automatic JPEG-to-PDF conversion, auto-sort into correct OneDrive folders by client and document type
Key Tools: N8N, OneDrive API, Python, client portal
Phase 3
What Gets Built: Microsoft Teams conversation notes automatically synced to the correct Practice Panther client matter file, eliminating manual double-entry
Key Tools: Teams Webhooks, Practice Panther API, N8N
Phase 4
What Gets Built: VoIP call auto-transcription using Vonage, caller ID matched to Practice Panther client records, transcripts automatically logged to the correct client file
Key Tools: Vonage API, N8N, Practice Panther API
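The caller ID matching in Phase 4 depends on comparing phone numbers that arrive in different formats. A minimal sketch of that comparison follows. It assumes ten-digit numbers are US numbers (country code 1) and uses a hypothetical client record shape, not the actual Practice Panther or Vonage data models.

```python
import re


def normalize_phone(raw: str, default_country_code: str = "1") -> str:
    """Reduce a phone number to digits only, adding a country code if absent.

    Assumption: a ten-digit number is a national number that needs the
    default country code prepended.
    """
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 10:
        digits = default_country_code + digits
    return digits


def match_caller(caller_id, clients):
    """Return the client record whose phone matches caller_id, or None.

    clients: list of dicts with "name" and "phone" keys (hypothetical shape).
    """
    index = {normalize_phone(client["phone"]): client for client in clients}
    return index.get(normalize_phone(caller_id))
```

Calls with no match return None and would need a fallback, such as a queue for manual assignment by staff.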
REQUIRED TECHNICAL SKILLS
You must have hands-on, demonstrated experience with the following. Theoretical knowledge is not sufficient — we will ask for portfolio examples and will verify through screening questions.
PREFERRED QUALIFICATIONS
Prior experience with legal technology, eDiscovery platforms, or document-heavy professional services industries
Experience building systems that process high volumes of documents at scale — not just small proof-of-concept builds
Familiarity with the Microsoft 365 and Azure ecosystem, including authentication and API patterns across Microsoft services
Experience designing modular, maintainable automation systems that can be extended over time — not one-off scripts
Strong documentation habits — detailed architecture notes, workflow exports, and API configuration documentation are required on handoff, not optional
Ability to identify technical risks proactively and recommend architecture improvements before they become problems
WHAT WE ARE NOT LOOKING FOR
Please do not apply if any of the following describes your primary experience or working style. These are not what this project requires.
Prompt engineering or ChatGPT-based automation without underlying pipeline and API work
Simple no-code Zapier or Make workflows without custom API logic or scripting
Proof-of-concept builds that have not handled production-scale document volumes
Candidates who cannot explain their system architecture clearly and concisely
Candidates who cannot commit to full documentation on handoff
HOW TO APPLY
To be considered, your proposal must include all five of the following. Proposals that do not address all five points will not be reviewed.
A brief description of the most complex document processing pipeline you have built — what tools you used, what volume of documents it handled, and what the outcome was
Your written answer to this question: How would you design a system to extract data from one million documents stored in OneDrive, reduce that set to the most relevant documents, and then send the extracted data to an AI model for analysis — without sending everything to AI at once?
At least one portfolio link or case study showing automation work with Azure Document Intelligence, N8N, or a comparable pipeline tool
Your proposed engagement structure — fixed-price milestones or hourly rate with estimated hours per phase, or a hybrid — with estimated costs for Phase 1
The number of relevant automation or pipeline contracts you have completed
Note: This role requires a Job Success Score of 90% or higher and at least five completed contracts with demonstrated relevance to document processing, API integration, or workflow automation. We will review your full Upwork profile and work history before scheduling a call.