Unstructured
This example shows how to use Unstructured.io to load files of many types. Unstructured currently supports text files, PowerPoint presentations, HTML, PDFs, images, and more.
Setup
You can run Unstructured locally on your computer using Docker, which must be installed first. Installation instructions are available in the official Docker documentation.
docker run -p 8000:8000 -d --rm --name unstructured-api downloads.unstructured.io/unstructured-io/unstructured-api:latest --port 8000 --host 0.0.0.0
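With the container from the command above listening on port 8000, the loader needs to know where to send files. The sketch below builds an options object for that purpose. It assumes that the loader accepts an `apiUrl` option and that the container serves the `/general/v0/general` path; check both against the versions you have installed. The helper name `localUnstructuredOptions` is hypothetical and used only for illustration.

```typescript
// Hypothetical options shape for illustration; `apiUrl` is assumed to be
// the option the loader reads to find the Unstructured endpoint.
interface LocalUnstructuredOptions {
  apiUrl: string;
  apiKey?: string;
}

// Builds options that target a locally running Unstructured container.
// The "/general/v0/general" path is an assumption about the API route.
function localUnstructuredOptions(
  port: number = 8000,
  host: string = "localhost",
  apiKey?: string
): LocalUnstructuredOptions {
  const apiUrl = `http://${host}:${port}/general/v0/general`;
  // Only include apiKey when one was supplied.
  return apiKey === undefined ? { apiUrl } : { apiUrl, apiKey };
}

const localOptions = localUnstructuredOptions();
console.log(localOptions.apiUrl); // prints "http://localhost:8000/general/v0/general"
```

The resulting object can be passed as the second constructor argument wherever the examples on this page pass `options`.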
Usage
Once Unstructured is running, you can use it to load files from your computer, as in the following code:
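import { UnstructuredLoader } from "@langchain/community/document_loaders/fs/unstructured";

// Placeholder API key; replace it with your own.
const options = {
  apiKey: "MY_API_KEY",
};

// The first argument is the path of the file to load.
const loader = new UnstructuredLoader(
  "src/document_loaders/example_data/notion.md",
  options
);

// load() returns the loaded documents.
const docs = await loader.load();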
API Reference:
- UnstructuredLoader from @langchain/community/document_loaders/fs/unstructured
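After loading, it is often useful to see what kinds of content were extracted. The following is a minimal sketch that groups documents by a `category` metadata field. It assumes each loaded document carries such a field (for example `Title` or `NarrativeText`); inspect `docs[0].metadata` to confirm what your version returns. The `DocLike` type and the sample data are hand-written stand-ins so the sketch runs without the loader.

```typescript
// Minimal stand-in for a loaded document: text plus free-form metadata.
interface DocLike {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Groups documents by metadata.category (an assumed key),
// placing documents without a string category under "Unknown".
function groupByCategory(docs: DocLike[]): Map<string, DocLike[]> {
  const groups = new Map<string, DocLike[]>();
  for (const doc of docs) {
    const raw = doc.metadata["category"];
    const key = typeof raw === "string" ? raw : "Unknown";
    const bucket = groups.get(key) ?? [];
    bucket.push(doc);
    groups.set(key, bucket);
  }
  return groups;
}

// Hand-written sample data standing in for the result of loader.load().
const sampleDocs: DocLike[] = [
  { pageContent: "A heading", metadata: { category: "Title" } },
  { pageContent: "First paragraph.", metadata: { category: "NarrativeText" } },
  { pageContent: "Second paragraph.", metadata: { category: "NarrativeText" } },
  { pageContent: "No category here.", metadata: {} },
];

groupByCategory(sampleDocs).forEach((group, category) => {
  console.log(`${category}: ${group.length}`);
});
```

To use this with real output, pass the array returned by `loader.load()` in place of `sampleDocs`.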
Directories
You can also load all of the files in a directory using UnstructuredDirectoryLoader, which inherits from DirectoryLoader:
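import { UnstructuredDirectoryLoader } from "@langchain/community/document_loaders/fs/unstructured";

// Placeholder API key; replace it with your own.
const options = {
  apiKey: "MY_API_KEY",
};

// The first argument is the path of the directory to load.
const loader = new UnstructuredDirectoryLoader(
  "langchain/src/document_loaders/tests/example_data",
  options
);

// load() returns the documents loaded from the files in the directory.
const docs = await loader.load();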
API Reference:
- UnstructuredDirectoryLoader from @langchain/community/document_loaders/fs/unstructured
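When a whole directory is loaded, the resulting array mixes documents from many files. As a rough sketch, the function below counts documents per source file. It assumes each document's metadata includes a `filename` field, which may not hold for every version or file type; the `LoadedDoc` type, the helper name, and the sample data are hypothetical stand-ins so the example runs on its own.

```typescript
// Minimal stand-in for a loaded document: text plus free-form metadata.
interface LoadedDoc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Counts documents per metadata.filename (an assumed key).
// Documents without a string filename are counted under "(unknown file)".
function countByFilename(docs: LoadedDoc[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const doc of docs) {
    const raw = doc.metadata["filename"];
    const name = typeof raw === "string" ? raw : "(unknown file)";
    counts[name] = (counts[name] ?? 0) + 1;
  }
  return counts;
}

// Hand-written sample data standing in for the directory loader's output.
const sampleDirectoryDocs: LoadedDoc[] = [
  { pageContent: "Intro", metadata: { filename: "a.md" } },
  { pageContent: "Body", metadata: { filename: "a.md" } },
  { pageContent: "Slide text", metadata: { filename: "b.pptx" } },
  { pageContent: "Orphan", metadata: {} },
];

console.log(countByFilename(sampleDirectoryDocs));
```

A per-file count like this is a quick way to spot files that produced no documents at all, since they will be missing from the result.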