
Data Entry & Extraction

Pull structured data from unstructured text instantly.

One of AI's genuine strengths is turning messy, unstructured text into clean, structured data. Extraction work that used to take an intern most of a day can often be drafted in under a minute, though the output still needs checking. This is not a marginal improvement -- it is an order-of-magnitude change in how fast you can process information.

Key Concept

The pattern for data extraction is always the same: tell AI exactly what fields you want, exactly what format to use, and what to do when data is missing or ambiguous. Specificity in, accuracy out.

Extracting Data from Text

A basic extraction prompt lists the fields, names the output format, and says what to do with missing or uncertain values:

"Extract the following information from this email/document and format as JSON:

- Company name

- Contact person

- Email address

- Phone number

- Requested service

- Budget mentioned

- Timeline

If any field is not found, use 'N/A'.

If you're unsure about an extraction, add a confidence note.

Document: [paste text]"
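Once the AI replies, a few lines of Python can confirm that every requested field came back and show which ones were marked 'N/A'. This is a minimal sketch, not a definitive implementation: the key names in `REQUIRED_FIELDS` are hypothetical (the prompt above does not fix them, so you would need to ask for these exact keys), and `sample_reply` is made-up data.

```python
import json

# Hypothetical key names chosen to mirror the prompt's field list.
REQUIRED_FIELDS = [
    "company_name",
    "contact_person",
    "email_address",
    "phone_number",
    "requested_service",
    "budget_mentioned",
    "timeline",
]


def check_extraction(reply_text):
    """Parse the AI's JSON reply and report which fields need a human look."""
    record = json.loads(reply_text)  # raises ValueError if the reply is not valid JSON
    missing = [field for field in REQUIRED_FIELDS if field not in record]
    marked_na = [field for field in REQUIRED_FIELDS if record.get(field) == "N/A"]
    return {"record": record, "missing_keys": missing, "marked_na": marked_na}


# Made-up reply for illustration only.
sample_reply = """{
  "company_name": "Acme Co",
  "contact_person": "Dana Lee",
  "email_address": "dana@example.com",
  "phone_number": "N/A",
  "requested_service": "Website redesign",
  "budget_mentioned": "N/A"
}"""

report = check_extraction(sample_reply)
print(report["missing_keys"])  # ['timeline']
print(report["marked_na"])     # ['phone_number', 'budget_mentioned']
```

A missing key usually means the AI dropped a field silently, which is different from a field it looked for and marked 'N/A'. Both are worth checking against the original document.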

Invoice Processing

"Extract line items from this invoice text and format as a table:

| Item | Quantity | Unit Price | Total |

Also extract: Invoice number, date, vendor name, subtotal, tax, and grand total.

Double-check that line items add up to the subtotal. Flag any discrepancies.

Invoice: [paste invoice text]"

Watch Out

The "double-check the math" instruction is critical. AI sometimes misreads numbers, transposes digits, or gets confused by formatting. For any financial data extraction, always include a validation step and verify the totals yourself before acting on them.
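The validation step can also run outside the chat. The sketch below recomputes each line total, the subtotal, and the grand total from the extracted values. It assumes the amounts have already been copied out as plain number strings (no currency symbols or thousands separators); the function name `check_invoice` and the sample invoice are made up for illustration.

```python
from decimal import Decimal


def check_invoice(line_items, subtotal, tax, grand_total):
    """Return a list of discrepancies; an empty list means every total agrees."""
    problems = []
    running_total = Decimal("0")
    for item in line_items:
        # Decimal avoids the rounding surprises of binary floats with money.
        expected = Decimal(item["quantity"]) * Decimal(item["unit_price"])
        extracted = Decimal(item["total"])
        if expected != extracted:
            problems.append(f"{item['item']}: expected {expected}, extracted {extracted}")
        running_total += extracted
    if running_total != Decimal(subtotal):
        problems.append(f"Line items sum to {running_total}, subtotal says {subtotal}")
    if Decimal(subtotal) + Decimal(tax) != Decimal(grand_total):
        problems.append("Subtotal plus tax does not equal the grand total")
    return problems


# Made-up extraction in which the Cable total has two digits transposed (13.05 for 13.50).
items = [
    {"item": "Widget", "quantity": "2", "unit_price": "19.99", "total": "39.98"},
    {"item": "Cable", "quantity": "3", "unit_price": "4.50", "total": "13.05"},
]
problems = check_invoice(items, subtotal="53.48", tax="4.28", grand_total="57.76")
for problem in problems:
    print(problem)
```

Here the transposed digits surface twice: once on the Cable line and once when the line items fail to add up to the subtotal. Either message is a cue to go back to the original invoice.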

Business Card / Contact Parsing

"Parse these business card details into a contact database format:

[paste business card text or multiple cards]

Format: CSV with columns: First Name, Last Name, Title, Company, Email, Phone, Address, LinkedIn URL, Notes

If a card has multiple phone numbers, use the mobile number for the Phone column and note others in a Notes column."
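CSV output tends to break when a value contains a comma, which is common in addresses. A quick structural check before importing can catch that. The sketch below assumes the reply uses the columns from the prompt plus the Notes column it mentions; the sample rows are invented.

```python
import csv
import io

EXPECTED_COLUMNS = [
    "First Name", "Last Name", "Title", "Company", "Email",
    "Phone", "Address", "LinkedIn URL", "Notes",
]


def check_contact_csv(csv_text):
    """Return a list of structural problems found in the AI's CSV output."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    problems = []
    if header != EXPECTED_COLUMNS:
        problems.append("header does not match the requested columns")
    # The header counts as row 1, so data rows start at 2.
    for row_number, row in enumerate(reader, start=2):
        if len(row) != len(header):
            problems.append(f"row {row_number}: {len(row)} fields, expected {len(header)}")
    return problems


# Invented output: the second contact's address is not quoted, so its comma splits the field.
sample_csv = "\n".join([
    ",".join(EXPECTED_COLUMNS),
    'Dana,Lee,CTO,Acme Co,dana@example.com,(555) 123-4567,"12 Main St, Springfield",,',
    "Sam,Ortiz,Buyer,Globex,sam@example.com,(555) 987-6543,9 Oak Ave, Shelbyville,,",
])
problems = check_contact_csv(sample_csv)
print(problems)  # ['row 3: 10 fields, expected 9']
```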

The Batch Processing Pattern

For processing many items at once, always provide an example first:

"Process each entry below. For each one, extract: name, date, amount, and category.

Example:

Input: 'Paid $450 to ABC Plumbing on March 15 for bathroom repair'

Output: Name: ABC Plumbing | Date: March 15 | Amount: $450 | Category: Home Repair

Now process these entries:

1. 'Bought $89 of office supplies at Staples on Tuesday'

2. 'Monthly Spotify subscription $14.99 charged Jan 1'

3. 'Dinner with client Sarah at Olive Garden $127.50 on 3/10'

4. 'Paid quarterly insurance premium $2,400 to State Farm'

5. 'Gas station fill-up $62.18 Shell on Highway 101'"

Pro Tip

Providing one example before the batch noticeably improves consistency. AI picks up your expectations for format and level of detail from that single example and applies them across all items. Without the example, you are likely to get inconsistent formatting that takes time to clean up.
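To confirm the batch really did come back in one format, you can parse each output line and flag any that stray from the example. This is a sketch that assumes the AI kept the `Key: value | Key: value` layout from the example above; `parse_line` and `find_malformed` are hypothetical helper names, and the second sample line is invented.

```python
EXPECTED_KEYS = ["Name", "Date", "Amount", "Category"]


def parse_line(line):
    """Split one 'Key: value | Key: value' output line into a dict."""
    fields = {}
    for part in line.split("|"):
        # partition splits on the first colon only, so values like '3:10 pm' survive.
        key, separator, value = part.partition(":")
        if separator:
            fields[key.strip()] = value.strip()
    return fields


def find_malformed(lines):
    """Return the 1-based positions of lines whose keys differ from the example."""
    return [
        position
        for position, line in enumerate(lines, start=1)
        if list(parse_line(line)) != EXPECTED_KEYS
    ]


output_lines = [
    "Name: ABC Plumbing | Date: March 15 | Amount: $450 | Category: Home Repair",
    "Name: Staples | Date: Tuesday | Amount: $89",  # invented line with Category missing
]
print(parse_line(output_lines[0])["Amount"])  # $450
print(find_malformed(output_lines))           # [2]
```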

Cleaning Messy Data

Real-world data is rarely clean. AI can handle much of the mess if you spell out the target format for each field:

"This spreadsheet data has inconsistencies. Standardize it:

- Names: First Last format, proper capitalization

- Phone numbers: (XXX) XXX-XXXX format

- Addresses: Full format with ZIP code

- Dates: YYYY-MM-DD format

- Remove duplicates (keep the most complete entry)

Data: [paste messy data]"
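After the AI returns the cleaned data, a spot check can confirm that values actually follow the formats you asked for. The sketch below covers the phone and date rules from the prompt; the function names are hypothetical, and the check covers format only, not whether the value itself is right.

```python
import re
from datetime import datetime

# Matches exactly the (XXX) XXX-XXXX layout requested in the prompt.
PHONE_PATTERN = re.compile(r"\(\d{3}\) \d{3}-\d{4}")


def is_standard_phone(value):
    """True if the value is in (XXX) XXX-XXXX form."""
    return PHONE_PATTERN.fullmatch(value) is not None


def is_standard_date(value):
    """True if the value is a real calendar date in zero-padded YYYY-MM-DD form."""
    try:
        parsed = datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    # strptime accepts '2024-3-5', so compare against the zero-padded form.
    return parsed.strftime("%Y-%m-%d") == value


print(is_standard_phone("(555) 123-4567"))  # True
print(is_standard_phone("555-123-4567"))    # False
print(is_standard_date("2024-03-15"))       # True
print(is_standard_date("3/15/2024"))        # False
```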

Transforming Between Formats

"Convert this data from [format A] to [format B]:

[paste data]"

Common conversions that fit this template:

- JSON to CSV

- Email thread to structured table

- Meeting notes to Jira tickets

- Resume text to database fields
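For a purely mechanical conversion such as JSON to CSV, a local conversion gives you a reference to compare the AI's output against. This is a minimal sketch using only Python's standard library; it assumes the JSON is a flat list of objects, and the sample records are invented.

```python
import csv
import io
import json


def json_to_csv(json_text):
    """Convert a JSON array of flat objects into CSV text."""
    rows = json.loads(json_text)
    # Collect column names in first-seen order so no field is silently dropped.
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # keys missing from a row are written as empty cells
    return buffer.getvalue()


sample = '[{"name": "Ana", "city": "Lima"}, {"name": "Bo", "email": "bo@example.com"}]'
csv_text = json_to_csv(sample)
print(csv_text)
```

The looser conversions in the list (meeting notes to tickets, email threads to tables) involve judgment, so there is no mechanical reference; those need a read-through against the source instead.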

Tips for Accurate Extraction

  1. Always specify the output format (JSON, CSV, table, etc.) -- ambiguity kills accuracy
  2. Provide one example for complex extractions -- AI learns your expectations
  3. Ask AI to flag uncertain extractions with a confidence indicator
  4. Always verify extracted numbers -- AI occasionally misreads or transposes digits
  5. For critical data, ask AI to double-check itself: "Review your extraction above. Did you miss anything or make any errors?"
  6. Chunk large datasets -- process 20-50 items at a time, not 500
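The chunking tip is easy to automate. The sketch below splits a long list of entries into batches and builds one prompt per batch; `chunk`, `build_batch_prompt`, the instruction text, and the placeholder entries are all hypothetical names and data chosen for illustration.

```python
def chunk(items, size=25):
    """Split a list into consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[start:start + size] for start in range(0, len(items), size)]


def build_batch_prompt(instructions, batch):
    """Number the entries in one batch and append them to the shared instructions."""
    numbered = "\n".join(f"{n}. '{entry}'" for n, entry in enumerate(batch, start=1))
    return f"{instructions}\n\nNow process these entries:\n\n{numbered}"


# 120 placeholder entries standing in for real expense notes.
entries = [f"Expense note {n}" for n in range(1, 121)]
batches = chunk(entries, size=50)
print([len(batch) for batch in batches])  # [50, 50, 20]

instructions = "Process each entry below. For each one, extract: name, date, amount, and category."
prompts = [build_batch_prompt(instructions, batch) for batch in batches]
```

Reusing the same instructions, and the same worked example, in every batch keeps the format consistent from one chunk to the next.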

Exercises

Prompt Challenge (+20 XP)

Find a real receipt, invoice, or email with data in it. Use AI to extract all data into a structured JSON format. Check every field against the original -- how accurate was the extraction?

Hint: Try a receipt with at least 5 line items. Check the math on totals -- that's where AI most commonly makes mistakes.

Prompt Challenge (+15 XP)

Create a batch processing prompt that converts 5 informal expense descriptions into a structured expense report table with columns: Date, Vendor, Amount, Category, Payment Method.

Hint: Make up realistic entries like "coffee meeting with client $12 at Starbucks." Always provide one example for AI to follow.

Quiz (+5 XP)

Why should you always provide an example when using the batch processing pattern?

Fill in the Blank (+5 XP)

When extracting critical data, you should ask AI to _______ itself to catch missed items or errors.