Methodology

How we work with 6,362 real AAO decisions

A transparent disclosure of where our data comes from, how we process it, what we do not claim, and how to audit our work.

Data source

Our corpus is built from publicly available decisions of the USCIS Administrative Appeals Office (AAO) on Form I-140 petitions filed under the EB-2 National Interest Waiver provision. AAO decisions are published as non-precedential redacted PDFs under USCIS’s FOIA-driven public-access programme.

We do not have any private access to USCIS data, RFE templates, or unpublished officer guidance. Every case in our corpus is something a member of the public could download from USCIS’s decision archive.

Corpus size and scope

  • Total cases: 6,362
  • Form: I-140 (immigrant petition for alien worker)
  • Classification: EB-2 National Interest Waiver under INA § 203(b)(2)(B)(i)
  • Decision body: Administrative Appeals Office (AAO)
  • Outcome distribution (approximate): ~5,292 denied · ~533 remanded · ~375 approved (numbers refresh on corpus update)
  • Service-centre distribution: Texas ~60% · Nebraska ~31% · Vermont ~4% · California ~2%

Important caveat. AAO cases are inherently filtered: every petition in the corpus was denied at least once by USCIS and then appealed. AAO-level approval rates are therefore far lower than first-pass USCIS approval rates. We use AAO data to understand which arguments fail at the highest level of administrative review, not to predict first-pass outcomes.
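
To put the caveat in numbers, the approximate outcome counts listed above can be converted into shares of the corpus. The sketch below is illustrative arithmetic only; the function and variable names are hypothetical. Note that the three approximate counts sum to 6,200, so 162 of the 6,362 cases are not covered by them; the text above does not say how those are classified.

```typescript
// Approximate counts as published in the outcome distribution above.
const TOTAL_CASES = 6362;
const approxCounts: Array<{ outcome: string; count: number }> = [
  { outcome: "denied", count: 5292 },
  { outcome: "remanded", count: 533 },
  { outcome: "approved", count: 375 },
];

// Share of the whole corpus, as a percentage rounded to one decimal place.
function shareOfCorpus(count: number, total: number): number {
  return Math.round((count / total) * 1000) / 10;
}

// Number of cases not covered by the listed counts.
function unaccounted(
  counts: Array<{ outcome: string; count: number }>,
  total: number
): number {
  let sum = 0;
  for (const entry of counts) {
    sum += entry.count;
  }
  return total - sum;
}

for (const entry of approxCounts) {
  console.log(entry.outcome + ": ~" + shareOfCorpus(entry.count, TOTAL_CASES) + "%");
}
// denied: ~83.2%
// remanded: ~8.4%
// approved: ~5.9%
console.log("not covered by the three counts: " + unaccounted(approxCounts, TOTAL_CASES));
// not covered by the three counts: 162
```

An approval share of roughly 6% at AAO level describes appeals of already-denied petitions, which is why it should not be read as a first-pass approval rate.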

Processing pipeline

  1. Acquisition. Decisions are downloaded as redacted PDFs from USCIS’s decision-archive endpoints.
  2. OCR + structural extraction. Each PDF is run through OCR and a structural parser to recover the case number, service centre, decision date, outcome (approve / deny / remand), and the prong-level reasoning.
  3. Field tagging. Petitioner profession, area of national priority, evidence types relied on, and primary reason for the outcome are extracted via a combination of regex pattern matching and LLM-based classification, then reviewed against a sampled validation set.
  4. Anonymisation check. AAO decisions are already redacted at publication; on customer-facing surfaces, we additionally suppress any quoted text fragment that contains more than two consecutive proper nouns.
  5. Indexing. Processed cases are embedded into a vector store used by the case-review service to retrieve the closest matches by profile.
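
The suppression rule in step 4 can be illustrated with a minimal sketch. It assumes a crude stand-in for proper-noun detection: any token that starts with an uppercase ASCII letter followed by lowercase letters. The production check may well use a different detector (for example, named-entity recognition), and the function names here are hypothetical.

```typescript
// Assumption: a "proper noun" is approximated as a capitalised word.
// This over-counts sentence-initial words and misses names with
// non-ASCII letters or internal capitals; it is a sketch, not a real NER pass.
const CAPITALISED = /^[A-Z][a-z]+$/;

// Length of the longest run of consecutive capitalised tokens in a fragment.
function longestCapitalisedRun(fragment: string): number {
  const tokens = fragment.split(/\s+/);
  let longest = 0;
  let current = 0;
  for (const raw of tokens) {
    // Strip surrounding punctuation so "Smith," still counts as "Smith".
    const token = raw.replace(/^[^A-Za-z]+|[^A-Za-z]+$/g, "");
    if (CAPITALISED.test(token)) {
      current += 1;
      if (current > longest) {
        longest = current;
      }
    } else {
      current = 0;
    }
  }
  return longest;
}

// Suppress when the fragment has MORE than `maxRun` consecutive proper nouns.
function shouldSuppress(fragment: string, maxRun: number = 2): boolean {
  return longestCapitalisedRun(fragment) > maxRun;
}
```

Under this heuristic a quote naming a three-word person name is suppressed, while a quote mentioning a two-word institution is kept, which matches the "more than two" threshold stated in step 4.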

What our analytics are derived from

Aggregate statistics (free /analysis dashboard)

Approval / denial / RFE rates, per-profession breakdowns, service-centre comparisons, quarterly trend lines, and denial-pattern frequency are all computed directly over the 6,362-decision corpus. Quarterly USCIS I-140 receipt / approval / pending statistics are sourced from public USCIS quarterly reports (Form I-140 Cases and Pending Inventory) and are credited to USCIS, not derived from our corpus.
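
As an illustration of what "computed directly over the corpus" means for a per-profession breakdown, the sketch below groups case records by profession and divides each outcome count by the group total. The record shape and all names are assumptions made for the example; only the three outcome labels come from the pipeline description.

```typescript
type Outcome = "approve" | "deny" | "remand";

// Hypothetical minimal record; real records carry more fields.
interface CaseRecord {
  profession: string;
  outcome: Outcome;
}

interface ProfessionBucket {
  profession: string;
  approve: number;
  deny: number;
  remand: number;
}

interface ProfessionRates {
  profession: string;
  total: number;
  approveRate: number;
  denyRate: number;
  remandRate: number;
}

// Count outcomes per profession, then convert counts to rates in [0, 1].
// Rows are returned in order of first appearance.
function ratesByProfession(cases: CaseRecord[]): ProfessionRates[] {
  // Prototype-less lookup so a profession named "constructor" cannot collide.
  const byName: { [profession: string]: ProfessionBucket | undefined } = Object.create(null);
  const buckets: ProfessionBucket[] = [];
  for (const c of cases) {
    let bucket = byName[c.profession];
    if (bucket === undefined) {
      bucket = { profession: c.profession, approve: 0, deny: 0, remand: 0 };
      byName[c.profession] = bucket;
      buckets.push(bucket);
    }
    bucket[c.outcome] += 1;
  }
  return buckets.map((b) => {
    const total = b.approve + b.deny + b.remand; // always >= 1 for an existing bucket
    return {
      profession: b.profession,
      total: total,
      approveRate: b.approve / total,
      denyRate: b.deny / total,
      remandRate: b.remand / total,
    };
  });
}
```

Returning the group total alongside each rate lets a reader judge how much weight a percentage deserves when a profession has only a handful of cases.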

Personalised case review

The $10 case review computes per-prong scores against the patterns recurring in approved-vs-denied cases for the petitioner’s profession bucket, returns the five most-similar cases retrieved by vector similarity, and surfaces an evidence-gap list. The score is a relative-positioning estimate, not a probability of approval.
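
The retrieval step can be pictured with the following sketch. It assumes cosine similarity as the similarity measure and a brute-force scan over all embeddings; the text above says only "vector similarity", and a real vector store would typically use an approximate index instead. All type and function names are hypothetical.

```typescript
// Hypothetical shape of an indexed case: an identifier plus its embedding.
interface IndexedCase {
  caseId: string;
  embedding: number[];
}

// Cosine similarity between two equal-length vectors; 0 if either has zero norm.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embedding dimensions differ");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every case against the query profile and keep the k highest scores.
function mostSimilar(
  query: number[],
  index: IndexedCase[],
  k: number = 5
): Array<{ caseId: string; score: number }> {
  return index
    .map((c) => ({ caseId: c.caseId, score: cosineSimilarity(query, c.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

The default of k = 5 mirrors the five most-similar cases returned by the case review. The similarity scores rank cases by closeness of profile; like the per-prong score, they are not probabilities of approval.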

Petition builder

The petition builder drafts the petition letter section by section, with retrieval-augmented generation grounded in (a) the recurring evidence patterns from approved AAO cases in the petitioner’s field, (b) the petitioner’s Google Scholar / OpenAlex profile where applicable, and (c) federal-policy citations (NIST, NSF, OSTP, agency strategic plans).

What we deliberately do not claim

  • No single accuracy percentage. Outcome prediction depends on factors a model cannot see — RFE strategy, officer assignment, expert-letter quality, mid-cycle policy changes. Publishing “X% accuracy” would be marketing, not measurement.
  • No fabricated reviews. We do not generate, seed, or buy reviews. Until we have verifiable third-party review collection (Trustpilot / G2), the platform’s structured data deliberately omits any aggregateRating node.
  • No “real-time policy updates” claim. Major USCIS policy memoranda are incorporated when published. Our corpus refreshes on a quarterly basis; we do not have a real-time feed from USCIS.
  • No legal advice. We are not a law firm, we do not have attorney-client privilege with customers, and our outputs are explicitly marked as data and document-generation, not legal counsel.
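
The aggregateRating omission described in the "No fabricated reviews" item can be sketched as a structured-data builder that emits the node only when a verified review summary is supplied. This is an assumption-laden illustration: the function name, the "Product" type, and the input shape are hypothetical, while aggregateRating, AggregateRating, ratingValue, and reviewCount are standard schema.org vocabulary.

```typescript
// Hypothetical summary of reviews collected by a verifiable third party.
interface VerifiedReviewSummary {
  ratingValue: number;
  reviewCount: number;
}

// Build a JSON-LD node; aggregateRating is present only with verified reviews.
function buildStructuredData(
  name: string,
  reviews: VerifiedReviewSummary | null
): { [key: string]: unknown } {
  const node: { [key: string]: unknown } = {
    "@context": "https://schema.org",
    "@type": "Product", // assumption: the actual schema.org type may differ
    name: name,
  };
  if (reviews !== null && reviews.reviewCount > 0) {
    node["aggregateRating"] = {
      "@type": "AggregateRating",
      ratingValue: reviews.ratingValue,
      reviewCount: reviews.reviewCount,
    };
  }
  return node;
}
```

Passing null, which corresponds to the current state described above, yields a node with no aggregateRating key at all rather than a placeholder rating.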

Refresh cadence

New AAO decisions are added to the corpus on a quarterly cadence. USCIS quarterly statistics on the /analysis dashboard are updated within ~30 days of USCIS publishing them. The last corpus refresh date is shown on the analytics dashboard.

How to audit our work

Every claim on the marketing surface is anchored either to (a) the 6,362-decision corpus or (b) the configuration in src/config/pricing.ts. There are no “X% accuracy” or “Y satisfied customers” claims you cannot trace.

If you would like to validate a specific statistic before purchasing, the free analytics dashboard exposes the same per-profession and per-service-centre breakdowns the paid products use. The profession lookup tool exposes the 6,362-decision corpus at the profession level.

Reporting issues

If you spot a number that looks wrong, an outdated USCIS statistic, or a methodological gap, email support@greenwayai.net. Corrections are logged publicly and the analytics dashboard timestamp updates accordingly.