DexClient
The main client for interacting with the Dex service.
Project Management
- create_project(name, configuration): Create a new project with optional configuration
- list_projects(): List all accessible projects
- get_project(project_id): Retrieve a specific project
- update_project(project_id, updates): Update project name, configuration, or status
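The project methods listed above combine naturally into a "get or create" helper. The sketch below is illustrative only: InMemoryClient is a hypothetical stand-in so the code runs without credentials or network access, and it assumes that list_projects() returns an iterable of wrapper objects exposing .data.name and that create_project() accepts name and configuration as keyword arguments.

```python
import asyncio
from types import SimpleNamespace


async def get_or_create_project(client, name):
    """Return the project called `name`, creating it if none exists."""
    for project in await client.list_projects():
        if project.data.name == name:
            return project
    return await client.create_project(name=name, configuration=None)


class InMemoryClient:
    """Hypothetical stand-in for DexClient; it keeps projects in a list."""

    def __init__(self):
        self._projects = []

    async def list_projects(self):
        return list(self._projects)

    async def create_project(self, name, configuration=None):
        entity = SimpleNamespace(
            id=f"proj_{len(self._projects) + 1}",
            name=name,
            configuration=configuration,
        )
        project = SimpleNamespace(data=entity)
        self._projects.append(project)
        return project


async def demo():
    client = InMemoryClient()
    first = await get_or_create_project(client, "invoices")
    second = await get_or_create_project(client, "invoices")
    return first.data.id, second.data.id


ids = asyncio.run(demo())
print(ids)  # both calls resolve to the same project
```

With a real DexClient, only get_or_create_project would be kept; the stand-in class and demo exist purely to make the sketch executable.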
Project
Represents a Dex project with isolated data and credentials.
File Operations
- upload_file(file_path): Upload a document to the project
- list_files(): List all uploaded files
- get_file(file_id): Get file metadata
- download_file(file_id): Download file content
Vector Store Operations
- create_vector_store(name, engine, embedding_model): Create a vector store with the SGP Knowledge Base engine
- list_vector_stores(): List all vector stores
- get_vector_store(vector_store_id): Get vector store details
- delete_vector_store(vector_store_id): Delete a vector store
DexFile
Represents an uploaded file in Dex.
Parsing
- parse(params): Parse a document into a structured format
Working with Parse Results
After parsing, you can access the structured content, including chunks and blocks.
ParseResult
Represents the result of a document parsing operation.
Extraction
- extract(extraction_schema, user_prompt, model, generate_citations, generate_confidence): Extract structured data using a schema, user prompt, model, and options
  - extraction_schema (BaseModel): Pydantic model class for extraction (pass the class directly, not model_json_schema())
  - user_prompt (str): Natural language instructions for extraction
  - model (str): LLM model to use (e.g., "openai/gpt-4o")
  - generate_citations (bool): Include source citations in results
  - generate_confidence (bool): Include confidence scores in results
Working with Extraction Results
After extraction, you can access the structured data, citations, and confidence scores.
VectorStore
Represents a vector store for semantic search and RAG-enhanced extraction.
Indexing
- add_parse_results(parse_result_ids): Add parsed documents to the vector store by parse result ID
- remove_files(file_ids): Remove files from the index
Search
- search(query, top_k, filters): Semantic search across all documents in the vector store
- search_in_file(file_id, query, top_k, filters): Search within a specific file with optional filters
Extraction
- extract(extraction_schema, user_prompt, model, generate_citations, generate_confidence): Extract structured data from the entire vector store with RAG context
Parse Job Parameters
When parsing documents, you can specify different engines and options to customize the parsing behavior.
Reducto Parse Parameters
ReductoParseJobParams - Parameters for the Reducto OCR engine.
Best for: English and Latin-script documents (Spanish, French, German, Italian, Portuguese, etc.) with tables, figures, and complex layouts.
Fields:
- engine (ParseEngine): Set to ParseEngine.REDUCTO
- options (ReductoParseEngineOptions): Parsing options
- advanced_options (dict): Advanced options for fine-tuning
- experimental_options (dict): Experimental features
- priority (bool): Whether to prioritize this job (default: False)
Fields of options (ReductoParseEngineOptions):
- chunking (ReductoChunkingOptions | None): Chunking configuration
  - chunk_mode (ReductoChunkingMethod): Chunking method (default: VARIABLE)
    - DISABLED: No chunking
    - BLOCK: Block-level chunks
    - PAGE: Page-level chunks
    - PAGE_SECTIONS: Page sections
    - SECTION: Section-level chunks
    - VARIABLE: Variable-size chunks based on content
  - chunk_size (int | None): Custom chunk size
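The nesting of these parameters is easier to see in code. The sketch below uses small dataclass stand-ins that mirror a subset of the documented field names, so it runs without the SDK installed; in real code the same names would be imported from dex_sdk.types. That the real types accept these keyword arguments is an assumption based on the field lists in this reference.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


# Local stand-ins mirroring the documented names. The real classes live in
# dex_sdk.types and may validate or default differently.
class ParseEngine(str, Enum):
    REDUCTO = "reducto"
    IRIS = "iris"
    CUSTOM = "custom"


class ReductoChunkingMethod(str, Enum):
    DISABLED = "disabled"
    BLOCK = "block"
    PAGE = "page"
    PAGE_SECTIONS = "page_sections"
    SECTION = "section"
    VARIABLE = "variable"


@dataclass
class ReductoChunkingOptions:
    chunk_mode: ReductoChunkingMethod = ReductoChunkingMethod.VARIABLE
    chunk_size: Optional[int] = None


@dataclass
class ReductoParseEngineOptions:
    chunking: Optional[ReductoChunkingOptions] = None


@dataclass
class ReductoParseJobParams:
    engine: ParseEngine = ParseEngine.REDUCTO
    options: ReductoParseEngineOptions = field(
        default_factory=ReductoParseEngineOptions
    )
    priority: bool = False


# Request page-level chunks and leave every other field at its default.
params = ReductoParseJobParams(
    options=ReductoParseEngineOptions(
        chunking=ReductoChunkingOptions(chunk_mode=ReductoChunkingMethod.PAGE)
    )
)
print(params.engine.value, params.options.chunking.chunk_mode.value)
```

The resulting object is what would be passed as params to parse() on a file.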
Iris Parse Parameters
IrisParseJobParams - Parameters for the Iris OCR engine.
Best for: Non-English, non-Latin scripts including Arabic, Hebrew, Chinese (CJK), Japanese, Korean, Thai, Hindi, and other Indic languages.
Fields:
- engine (ParseEngine): Set to ParseEngine.IRIS
- options (IrisParseEngineOptions): Parsing options
  - layout (str | None): Layout detection model to use
  - text_ocr (str | None): Text OCR model to use
  - table_ocr (str | None): Table OCR model to use
  - text_prompt (str | None): Custom prompt for text extraction (VLMs only)
  - table_prompt (str | None): Custom prompt for table extraction (VLMs only)
  - left_to_right (bool | None): Sort regions left-to-right instead of right-to-left (default: False)
  - confidence_threshold (float | None): Minimum confidence threshold for layout detection
  - containment_threshold (float | None): Containment threshold for filtering overlapping boxes
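The "Best for" notes above suggest a simple rule of thumb for picking an engine. The following sketch is a heuristic of our own, not part of the SDK: suggest_engine is a hypothetical helper that assumes you already have a short text sample (a title or known metadata, for example) and returns the ParseEngine value string to use. The 50% threshold and the Reducto fallback for samples without letters are arbitrary choices.

```python
import unicodedata


def suggest_engine(sample_text: str) -> str:
    """Return "reducto" for mostly Latin-script text, otherwise "iris"."""
    letters = [ch for ch in sample_text if ch.isalpha()]
    if not letters:
        # Nothing to judge by; fall back to Reducto (arbitrary default).
        return "reducto"
    latin = sum(
        1 for ch in letters if unicodedata.name(ch, "").startswith("LATIN")
    )
    return "reducto" if latin / len(letters) >= 0.5 else "iris"


print(suggest_engine("Factura número 42"))
print(suggest_engine("請求書"))
```

Mixed-script documents may need a manual choice; the heuristic only looks at the share of Latin letters in the sample.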
Common Types
This section documents the core data models and types used throughout the Dex SDK.
Type Categories
Importable Types - Types you can import from dex_sdk.types to configure your requests:
- Configuration types (ProjectConfiguration, RetentionPolicy)
- Parse parameter types (ReductoParseJobParams, IrisParseJobParams, etc.)
- Enum types (ParseEngine, ReductoChunkingMethod, VectorStoreEngines)
Response Entity Types - Types returned by SDK methods and accessed via the .data attribute on wrapper objects:
- When you call SDK methods, you get wrapper objects (DexProject, DexFile, DexParseResult, etc.)
- Access the underlying data via .data: project.data.id, file.data.filename
- These entities are automatically validated but don’t need to be imported
Configuration Types
ProjectConfiguration
Configuration options for a Dex project.
Import: from dex_sdk.types import ProjectConfiguration
Fields:
- retention (RetentionPolicy | None): Data retention policy for the project
RetentionPolicy
Defines data retention periods for automatic cleanup of files and processing artifacts.
Import: from dex_sdk.types import RetentionPolicy
Fields:
- files (timedelta | None): Retention period for uploaded files. Files older than this period are automatically deleted. If None, files are retained indefinitely.
- result_artifacts (timedelta | None): Retention period for parse results, extraction results, and job artifacts. If None, artifacts are retained indefinitely.
Use cases:
- Compliance: Meet regulatory requirements (GDPR, HIPAA, etc.)
- Cost Management: Automatically clean up old data to reduce storage costs
- Security: Limit exposure of sensitive documents by enforcing retention limits
Note: The retention period is calculated from the creation time of the file or artifact. Retention policies can be updated at any time using update_project().
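Because retention is measured from the creation time, the date at which an item becomes eligible for cleanup is plain timedelta arithmetic. The helpers below are a minimal sketch using only the standard library; retention_deadline and is_past_retention are hypothetical names, and when the service actually performs cleanup after the deadline is not specified here.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def retention_deadline(
    created_at: datetime, period: Optional[timedelta]
) -> Optional[datetime]:
    """Return when an item passes its retention period (None = kept forever)."""
    if period is None:
        return None
    return created_at + period


def is_past_retention(
    created_at: datetime, period: Optional[timedelta], now: datetime
) -> bool:
    """True once the item is older than its retention period."""
    deadline = retention_deadline(created_at, period)
    return deadline is not None and now > deadline


uploaded = datetime(2024, 1, 1, tzinfo=timezone.utc)
files_retention = timedelta(days=30)
print(retention_deadline(uploaded, files_retention).isoformat())
# → 2024-01-31T00:00:00+00:00
```

The same timedelta values are what the files and result_artifacts fields of RetentionPolicy expect.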
ExtractionParameters
Parameters for extraction operations.
Import: from dex_sdk.types import ExtractionParameters
Fields:
- model (str): LLM model to use (e.g., "openai/gpt-4o")
- model_kwargs (dict | None): Additional kwargs for the LLM model
- extraction_schema (dict): JSON schema defining the desired output structure
- system_prompt (str | None): High-level instructions for the extraction model
- user_prompt (str | None): Specific hints about the current document
- generate_citations (bool): Whether to return bounding boxes for extracted values (default: True)
- generate_confidence (bool): Whether to return confidence scores (default: True)
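Note that extraction_schema here is a JSON schema dict, whereas the extract() methods take a Pydantic model class. To show the shape of these fields, the sketch below builds a plain dictionary with the same keys and a hand-written JSON Schema. The invoice fields and prompt texts are hypothetical, and whether the real ExtractionParameters type accepts exactly this structure is an assumption.

```python
import json

# Hypothetical invoice schema, written as plain JSON Schema.
invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"},
    },
    "required": ["invoice_number", "total"],
}

# Keys mirror the ExtractionParameters fields listed above.
extraction_parameters = {
    "model": "openai/gpt-4o",
    "model_kwargs": None,
    "extraction_schema": invoice_schema,
    "system_prompt": "Extract invoice fields exactly as printed.",
    "user_prompt": "Totals are in the bottom-right box.",
    "generate_citations": True,
    "generate_confidence": True,
}

# The structure is JSON-serializable, which makes it easy to log or store.
payload = json.dumps(extraction_parameters, indent=2)
print(payload)
```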
Parse Configuration Types
ParseEngine
Enum of available OCR engines.
Import: from dex_sdk.types import ParseEngine
Values:
- REDUCTO = "reducto"
- IRIS = "iris"
- CUSTOM = "custom"
ReductoParseJobParams
Parameters for the Reducto OCR engine.
Import: from dex_sdk.types import ReductoParseJobParams
See the Parse Job Parameters section for detailed usage.
IrisParseJobParams
Parameters for the Iris OCR engine.
Import: from dex_sdk.types import IrisParseJobParams
See the Parse Job Parameters section for detailed usage.
ReductoChunkingMethod
Enum of chunking methods for the Reducto parser.
Import: from dex_sdk.types import ReductoChunkingMethod
Values:
- DISABLED = "disabled"
- BLOCK = "block"
- PAGE = "page"
- PAGE_SECTIONS = "page_sections"
- SECTION = "section"
- VARIABLE = "variable"
ReductoChunkingOptions
Chunking configuration for the Reducto parser.
Import: from dex_sdk.types import ReductoChunkingOptions
Fields:
- chunk_mode (ReductoChunkingMethod): Chunking method
- chunk_size (int | None): Custom chunk size
ReductoParseEngineOptions
Options for the Reducto parser.
Import: from dex_sdk.types import ReductoParseEngineOptions
Fields:
- chunking (ReductoChunkingOptions | None): Chunking configuration
IrisParseEngineOptions
Options for the Iris parser.
Import: from dex_sdk.types import IrisParseEngineOptions
Fields:
- layout (str | None): Layout detection model
- text_ocr (str | None): Text OCR model
- table_ocr (str | None): Table OCR model
- text_prompt (str | None): Custom prompt for text extraction
- table_prompt (str | None): Custom prompt for table extraction
- left_to_right (bool | None): Sort regions left-to-right
- confidence_threshold (float | None): Minimum confidence threshold
- containment_threshold (float | None): Containment threshold for filtering
Vector Store Types
VectorStoreEngines
Enum of available vector store engines.
Import: from dex_sdk.types import VectorStoreEngines
Values:
- SGP_KNOWLEDGE_BASE = "sgp_knowledge_base"
VectorStoreSearchResult
Result from vector store search operations.
Import: from dex_sdk.types import VectorStoreSearchResult
Response Entity Types
These types are returned by SDK methods and accessed via the .data attribute on wrapper objects. You typically don’t need to import these directly.
Working with Response Data
When you call SDK methods, you receive wrapper objects with a .data attribute.
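As a rough illustration of the wrapper pattern, the sketch below reads entity fields through .data. The project object is a SimpleNamespace stand-in with the ProjectEntity field names from this reference, not a real DexProject, and describe is a hypothetical helper.

```python
from datetime import datetime, timezone
from types import SimpleNamespace

# Stand-in for a wrapper object returned by the SDK: the wrapper itself
# carries behavior, while the validated entity sits under .data.
project = SimpleNamespace(
    data=SimpleNamespace(
        id="proj_123",
        name="Invoices",
        status="active",
        configuration=None,
        created_at=datetime(2024, 1, 1, tzinfo=timezone.utc),
        archived_at=None,
    )
)


def describe(project) -> str:
    """Summarize a project using only fields read from project.data."""
    entity = project.data
    return f"{entity.id} ({entity.name}): {entity.status}"


print(describe(project))  # → proj_123 (Invoices): active
```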
Common Response Entity Fields
ProjectEntity (accessed via project.data):
- id (str): Project ID with proj_ prefix
- name (str): Human-readable project name
- status (str): Project status ("active" or "archived")
- configuration (ProjectConfiguration | None): Project configuration
- created_at (datetime): When the project was created
- archived_at (datetime | None): When the project was archived
File entity (accessed via dex_file.data):
- id (str): File ID with file_ prefix
- project_id (str): ID of the project that the file belongs to
- filename (str): Original filename
- size_bytes (int): File size in bytes
- mime_type (str): MIME type of the file
- status (str): Current file status
- created_at (datetime): When the file was uploaded
Parse result entity (accessed via parse_result.data):
- id (str): Parse result ID with pres_ prefix
- project_id (str): Project ID
- source_document_id (str): ID of the source document that was parsed
- engine (str): Engine used for parsing
- parse_metadata (object): Metadata including filename and pages_processed
- content (object): Parsed content with chunks
- created_at (datetime): When the parse result was created
Extraction result entity (accessed via extract_result or in extraction results):
- id (str): Extraction result ID
- source_id (str): ID of the source that was extracted from
- result (object): The extraction result with data and usage_info
- parameters (ExtractionParameters): Parameters used for extraction
- created_at (datetime): When the extraction was completed
- processing_time_ms (int | None): Processing time in milliseconds
Vector store entity (accessed via vector_store.data):
- id (str): Vector store ID with vs_ prefix
- project_id (str): Project ID
- name (str): Name of the vector store
- engine (str): Engine used for the vector store
- created_at (datetime): When the vector store was created
Deprecated Types
The following types are deprecated as of version 0.3.2 and should no longer be used:
- ProjectCredentials - No longer used; credentials are passed to the DexClient constructor
- SGPCredentials - No longer used; credentials are passed to the DexClient constructor
Error Handling
The SDK raises exceptions for various error conditions. For detailed troubleshooting guidance, see the Troubleshooting Guide.
Async/Await Pattern
The Dex SDK is fully async. Use await with all SDK methods.
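A common mistake with async APIs is calling a method without await, which yields a coroutine object instead of a result. The sketch below demonstrates this with FakeDexClient, a hypothetical stand-in whose list_projects() coroutine mimics the documented method name; the real DexClient constructor arguments and return types are not shown here.

```python
import asyncio
import inspect


class FakeDexClient:
    """Hypothetical stand-in; the real DexClient talks to the Dex service."""

    async def list_projects(self):
        await asyncio.sleep(0)  # placeholder for a network round trip
        return ["proj_a", "proj_b"]


async def main():
    client = FakeDexClient()

    # Without await, the call only creates a coroutine object.
    pending = client.list_projects()
    assert inspect.iscoroutine(pending)

    # Awaiting it runs the call and produces the actual result.
    projects = await pending
    return projects


# asyncio.run() is the entry point when calling from synchronous code.
result = asyncio.run(main())
print(result)  # → ['proj_a', 'proj_b']
```

Inside an environment that already runs an event loop, such as a notebook, use await main() directly rather than asyncio.run().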
See Also
- Getting Started Guide: Quick start tutorial for beginners
- Quick Reference: Cheat sheet for common patterns and imports
- Advanced Features: Vector stores, chunking strategies, and optimization
- Introduction to Dex: Core concepts and architecture overview
- Troubleshooting Guide: Common issues and solutions

