Déjà Code: How LLMs Quietly Cheat on Repos They've Already Seen

1. Introduction

Large language models (LLMs) are predominantly pre-trained on massive corpora sourced from the internet, encompassing web crawls, encyclopedic references, digitized books, and code repositories (Brown et al., 2020; Penedo et al., 2024). Given the vast scale and wide-ranging provenance of such training data, evaluation benchmarks may be explicitly or implicitly incorporated into the pre-training corpus (Dong et al., 2024). Consequently, this overlap, commonly termed data contamination, has been shown to inflate in-distribution benchmark scores while simultaneously degrading out-of-distribution generalization (Tu et al., 2024; Sainz et al., 2023).

At Latentforce, we ran several experiments to study LLM contamination. We call a model contaminated with respect to a given data source if it demonstrates prior knowledge of that source without the source being supplied as context. This post presents our empirical investigation into how contamination affects LLM performance on repositories created before and after the model's pre-training knowledge cutoff.

2. Data and Methods

We curated two disjoint sets of GitHub repositories (see Appendix for full details), each subject to the following inclusion criteria: (i) a minimum of 1,000 stars, and (ii) a consistent use-case domain, specifically command-line interface (CLI) tools implemented in Rust. The two sets are differentiated solely by their creation date relative to the reported knowledge cutoff of the model under evaluation.

The first set, hereafter referred to as the pre-cutoff corpus, consists exclusively of repositories created before September 15, 2025 — the approximate knowledge cutoff of GPT-5.4, the model evaluated in this study. The second set, the post-cutoff corpus, consists exclusively of repositories created after this date. This partition is motivated by the following reasoning: any repository created after the knowledge cutoff cannot, by construction, have been included in the model's pre-training data. Conversely, while inclusion cannot be guaranteed for pre-cutoff repositories, the repositories selected for that collection are sufficiently prominent to make prior exposure probable.
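The partition rule above can be sketched in a few lines. This is a minimal illustration, not the selection script used in the study: the `Repo` record, its field names, and the sample entries are all hypothetical, and only the cutoff date and the 1,000-star threshold come from the text.

```python
from dataclasses import dataclass
from datetime import date

# Cutoff date and star threshold as stated in the text.
KNOWLEDGE_CUTOFF = date(2025, 9, 15)
MIN_STARS = 1000


@dataclass(frozen=True)
class Repo:
    # Hypothetical record; the field names are illustrative only.
    full_name: str
    created: date
    stars: int


def partition(repos):
    """Split eligible repos into (pre_cutoff, post_cutoff) by creation date.

    A repo created exactly on the cutoff date falls into neither set,
    matching the "before" / "after" wording of the inclusion rule.
    """
    eligible = [r for r in repos if r.stars >= MIN_STARS]
    pre = [r for r in eligible if r.created < KNOWLEDGE_CUTOFF]
    post = [r for r in eligible if r.created > KNOWLEDGE_CUTOFF]
    return pre, post


# Made-up example records, not the repositories used in the study.
sample = [
    Repo("example/old-tool", date(2019, 3, 1), 52000),
    Repo("example/new-tool", date(2025, 11, 2), 1800),
    Repo("example/tiny-tool", date(2025, 12, 1), 40),
]
pre, post = partition(sample)
print([r.full_name for r in pre], [r.full_name for r in post])
```

The domain criterion (Rust CLI tools) is not modeled here, since it requires a judgment about each repository rather than a metadata comparison.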

For both corpora, evaluation was performed against the most recent commit available at the time of testing, rather than the exact snapshot contemporaneous with model training. This design choice is deliberate: it allows us to assess whether parametric familiarity with a codebase — even an earlier version thereof — confers a measurable performance advantage, independent of whether the model has seen the precise revision under test.

3. Tasks and Evaluation

The evaluation suite was designed such that an uncontaminated model — one lacking parametric familiarity with the repository under test — would be expected to perform at a measurably lower level than a contaminated one. Two tasks were devised to this end.

Task 1: File Path Localization

In this task, the model is provided with the name of a repository and the name of a single file known to exist within it, and is asked to predict the file's full path within the repository tree. To control for ambiguity, only files with globally unique names within the repository are selected. The task admits exact, binary verification via the GitHub API — a predicted path either exists in the repository at the specified commit or it does not, obviating the need for an LLM-as-judge evaluation protocol. The prompt template is as follows:

Prompt Template

There is only one file called {filename} in the {project} GitHub repository at commit {sha} (dated {date}). Where in the repo is it located? You will not be asked for files directly under the root. There will be at least one subfolder. Give me ONLY the path. No random double checking. {tree_context}

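Correct localization under this task is interpreted as evidence of contamination: a model without prior parametric exposure to the repository has little basis for inferring the internal directory structure beyond common layout conventions (for example, src/main.rs in a Rust project). Conventional guesses therefore set a nonzero baseline, and it is the gap between the pre-cutoff and post-cutoff corpora that carries the signal.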
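The file selection and scoring rules for this task can be sketched as follows. This is an assumption-laden illustration rather than the harness used in the study: `candidate_files` and `is_correct` are hypothetical helper names, and the list of paths is taken as given (in practice it could come from a recursive tree listing of the repository at the commit under test).

```python
from collections import Counter
from pathlib import PurePosixPath


def candidate_files(paths):
    """Return paths whose file name is unique in the repo and not at the root."""
    counts = Counter(PurePosixPath(p).name for p in paths)
    return [
        p
        for p in paths
        if counts[PurePosixPath(p).name] == 1 and len(PurePosixPath(p).parts) > 1
    ]


def is_correct(predicted, actual):
    """Binary check: the predicted path must equal the file's real path."""
    return predicted.strip().strip("/") == actual


# Hypothetical repository tree, for illustration only.
tree = [
    "README.md",
    "src/main.rs",
    "src/exec/job.rs",
    "src/exec/mod.rs",
    "src/filter/mod.rs",
    "src/filter/size.rs",
]
print(candidate_files(tree))
print(is_correct("src/filesystem/size.rs", "src/filter/size.rs"))
```

Because each selected file name is unique in the tree, comparing against the actual path is equivalent to checking that the predicted path exists and ends in that file name.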

Task 2: Non-Trivial Symbol Identification

In this task, the model is given the path of a source file in the repository, but not its contents, and is asked to name a non-trivial function or class defined in it. Successful identification is taken as evidence of contamination, as it is statistically improbable that a model without prior exposure could correctly nominate a specific non-trivial symbol by chance alone. The task is operationalized in three stages:

Stage 1 — Triviality Filtering

To ensure that only semantically meaningful symbols are considered, a preliminary filtering pass is conducted in which the model classifies each candidate file and enumerates non-trivial symbol candidates.

Prompt Template

For this file {filename}, return a JSON object as follows {"verdict": "TRUE" or "FALSE", "examples": <list of functions/classes you consider non-trivial>}. TRUE means the file contains non-trivial function names or classes (e.g., 'main' in '__main__.py' is trivial, but 'check_squares' in 'cli_llm.py' is not). FALSE means it does not: {contents}
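A reply in the shape requested by this prompt can be validated before it is used to filter files. The parser below is a minimal sketch; the function name `parse_triviality_verdict` and the strictness of the checks are assumptions, not a description of the pipeline used in the study.

```python
import json


def parse_triviality_verdict(raw):
    """Parse a Stage 1 reply into (has_nontrivial, examples).

    Raises ValueError if the reply is not valid JSON or does not
    match the shape requested in the prompt.
    """
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    verdict = data.get("verdict")
    examples = data.get("examples", [])
    if verdict not in ("TRUE", "FALSE"):
        raise ValueError(f"unexpected verdict: {verdict!r}")
    if not isinstance(examples, list) or not all(isinstance(e, str) for e in examples):
        raise ValueError("examples must be a list of strings")
    return verdict == "TRUE", examples


# Illustrative reply, reusing the symbol name from the prompt's own example.
reply = '{"verdict": "TRUE", "examples": ["check_squares"]}'
print(parse_triviality_verdict(reply))
```

Files whose verdict is FALSE, or whose reply fails validation, would simply be skipped when sampling files for Stage 2.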

Stage 2 — Symbol Elicitation

The model under evaluation is asked to return exactly one non-trivial function or class name from a given file.

Prompt Template

{path} is a file in the {repo} repository. DO NOT guess trivially based on the name of the file, i.e., guessing 'main' for '__main__.py' will be a fail. Return exactly one name of a function or class in this file. The file is guaranteed to have at least one non-trivial name/class.

Stage 3 — LLM-as-Judge Verification

A separate judge model receives the file contents alongside the elicited guess and determines whether the nominated symbol is genuinely present in the file.

Judge Prompt Template

You are provided a potential function or class name from the file {path} in {repo}, "{guess}", and here are the actual contents of the file: {contents} Return only the word PASS if the function or class exists. Also, if the function/class name is based on the name of the file, e.g. 'main' for '__main__.py', return FAIL.
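The existence half of the judge's decision can also be cross-checked deterministically. The sketch below is not the LLM judge described above; it is a swapped-in regex check, offered as a sanity test under stated assumptions: the keyword list is a guess at what counts as a definition in Rust or Python source, it ignores macros and constants, and it will also match definitions that appear inside comments or strings. It does not implement the filename-triviality rule.

```python
import re

# Keywords that introduce a named definition in Rust or Python source.
# This list is an assumption and is not exhaustive.
_DEF = re.compile(
    r"\b(?:fn|struct|enum|trait|type|mod|def|class)\s+([A-Za-z_][A-Za-z0-9_]*)"
)


def defined_symbols(source):
    """Return the set of names that directly follow a definition keyword."""
    return set(_DEF.findall(source))


def cross_check(guess, source):
    """True only if the guess is, verbatim, a defined name in the source."""
    return guess in defined_symbols(source)


# Hypothetical file contents, for illustration only.
source = (
    "pub struct SizeFilter { min: u64 }\n"
    "\n"
    "impl SizeFilter {\n"
    "    pub fn is_within(&self, size: u64) -> bool { size >= self.min }\n"
    "}\n"
)
print(sorted(defined_symbols(source)))
print(cross_check("SizeFilter", source), cross_check("JobQueue", source))
```

A check like this compares the whole guess string, so a guess that contains a real symbol followed by stray characters is rejected; disagreements between it and the judge are worth reviewing by hand.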

4. Results and Discussion

Across both evaluation tasks, the model performed markedly better on pre-cutoff repositories than on post-cutoff repositories (as shown in Fig. 1): an absolute gain of about 13 percentage points on each task, which corresponds to a relative improvement of roughly 74% on Task 1 and 50% on Task 2. This disparity constitutes strong empirical evidence that contamination, as defined above, confers a measurable and substantial performance advantage. These findings are consistent with the hypothesis that parametric familiarity with a codebase meaningfully augments a model's capacity to perform repository-level reasoning tasks. Repo-wise detailed results are provided in the Appendix.

Task 1: File Path Localization

The model achieved 31.0% accuracy on pre-cutoff repositories (31 of 100 files) versus 17.8% on post-cutoff repositories (16 of 90 files), a difference of +13.2 percentage points.

Task 2: Non-Trivial Symbol Identification

The model achieved a 40.0% pass rate on pre-cutoff repositories (20 of 50 files) versus 26.7% on post-cutoff repositories (12 of 45 files), a difference of +13.3 percentage points.
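The headline figures for both tasks can be recomputed from raw counts. In the sketch below the counts are tallied from the ✓ and PASS entries in Appendix A and Appendix B; the `summarize` helper is a hypothetical name introduced for illustration.

```python
def summarize(name, pre, post):
    """Compute rates, absolute gap (percentage points), and relative lift.

    `pre` and `post` are (successes, trials) tuples.
    """
    pre_rate = pre[0] / pre[1]
    post_rate = post[0] / post[1]
    diff_pp = (pre_rate - post_rate) * 100
    rel_pct = (pre_rate / post_rate - 1) * 100
    print(f"{name}: {pre_rate:.1%} vs {post_rate:.1%}, "
          f"{diff_pp:+.1f} pp, relative {rel_pct:+.1f}%")
    return {"pre": pre_rate, "post": post_rate, "diff_pp": diff_pp, "rel_pct": rel_pct}


# Counts tallied from the appendix tables.
task1 = summarize("Task 1", pre=(31, 100), post=(16, 90))
task2 = summarize("Task 2", pre=(20, 50), post=(12, 45))
# Task 1: 31.0% vs 17.8%, +13.2 pp, relative +74.4%
# Task 2: 40.0% vs 26.7%, +13.3 pp, relative +50.0%
```

Reporting both the absolute gap and the relative lift avoids ambiguity, since the two tasks have nearly identical gaps in percentage points but different relative improvements.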

5. Conclusion

The practical implications of this advantage are significant. A contaminated model, by virtue of its internalized structural knowledge, can perform operations such as file path localization without exhaustive traversal of the repository tree — an efficiency that would directly translate to faster and more accurate code editing in agentic deployment settings. Conversely, the results suggest that models operating over uncontaminated codebases, such as private repositories, or those whose owners have opted out of training data collection, do not benefit from this parametric scaffolding, and performance regresses accordingly.

This observation motivates the need for tooling designed to compensate for the absence of parametric familiarity. Specifically, a tool capable of extracting implicit structural and dependency information from unseen repositories — such as LatentGraph, which constructs explicit dependency representations from novel codebases — could serve as a mechanism to recover the performance advantage otherwise conferred by contamination, by supplying equivalent context through explicit rather than parametric means.

6. References

Brown et al. (2020). Language Models are Few-Shot Learners. NeurIPS.
Penedo et al. (2024). The FineWeb Datasets. NeurIPS.
Dong et al. (2024). Generalization or Memorization: Data Contamination and Trustworthy Evaluation for LLMs. arXiv:2402.15938.
Tu et al. (2024). DICE: Detecting In-Distribution Contamination. arXiv:2406.04197.
Sainz et al. (2023). NLP Evaluation in Trouble. Findings of EMNLP 2023.

Appendix A — File Path Test: Raw Results

Before cutoff

BurntSushi/ripgrep — commit 4519153 (2026-02-27)

File	Guessed path	Actual path	Correct
path.rs	crates/ignore/src/path.rs	crates/printer/src/path.rs	✗
template.long.help	crates/core/flags/doc/template.long.help	crates/core/flags/doc/template.long.help	✓
messages.rs	crates/printer/src/messages.rs	crates/core/messages.rs	✗
pattern.rs	crates/globset/src/pattern.rs	crates/cli/src/pattern.rs	✗
sha256-releases	ci/release/sha256-releases	ci/sha256-releases	✗
ast.rs	crates/regex/src/ast.rs	crates/regex/src/ast.rs	✓
sherlock.lz4	tests/data/sherlock.lz4	tests/data/sherlock.lz4	✓
benchsuite	scripts/benchsuite	benchsuite/benchsuite	✗
aliases.rs	crates/ignore/src/overrides/aliases.rs	crates/printer/src/hyperlink/aliases.rs	✗
sherlock.gz	tests/data/sherlock.gz	tests/data/sherlock.gz	✓

Orange-OpenSource/hurl — commit e81ad96 (2026-03-14)

File	Guessed path	Actual path	Correct
format.py	packages/hurlfmt/src/hurlfmt/format.py	bin/spec/options/format.py	✗
live.err	packages/hurl/src/tests_ok/live.err	integration/hurl/tests_ssl/live.err	✗
retry.py	tests/integration/retry.py	integration/hurl/tests_ok/retry/retry.py	✗
ignore_asserts.sh	bin/ignore_asserts.sh	integration/hurl/tests_ok/ignore_asserts/ignore_asserts.sh	✗
json_list_trailing_comma.exit	tests_ok/json/json_list_trailing_comma.exit	integration/hurl/tests_error_parser/json_list_trailing_comma.exit	✗
invalid_escape.err	tests/error/invalid_escape.err	integration/hurl/tests_error_parser/invalid_escape.err	✗
stdout.py	hurl/core/stdout.py	integration/hurl/tests_ok/stdout/stdout.py	✗
term.py	hurl/parser/term.py	integration/term.py	✗
max_redirect_infinite.hurl	tests/cli/tests/max_redirect_infinite.hurl	integration/hurl/tests_ok/max_redirect/max_redirect_infinite.hurl	✗
quiz-dark.png	docs/assets/img/quiz-dark.png	docs/assets/img/quiz-dark.png	✓

Rigellute/spotify-tui — commit c4dcf6b (2021-11-17)

File	Guessed path	Actual path	Correct
select_device.rs	src/ui/select_device.rs	src/handlers/select_device.rs	✗
dialog.rs	src/ui/dialog.rs	src/handlers/dialog.rs	✗
handle.rs	src/streaming/handle.rs	src/cli/handle.rs	✗
playbar.rs	src/ui/playbar.rs	src/handlers/playbar.rs	✗
spt.png	images/spt.png	snap/gui/spt.png	✗
config.rs	src/config.rs	src/config.rs	✓
track_table.rs	src/ui/components/track_table.rs	src/handlers/track_table.rs	✗
artist_albums.rs	src/cli/artist_albums.rs	src/handlers/artist_albums.rs	✗
help.rs	src/ui/help.rs	src/ui/help.rs	✓
playlist.rs	src/ui/page/playlist.rs	src/handlers/playlist.rs	✗

ajeetdsouza/zoxide — commit 61f19a6 (2026-03-05)

File	Guessed path	Actual path	Correct
fish.txt	templates/fish.txt	templates/fish.txt	✓
zoxide.ts	templates/zoxide.ts	contrib/completions/zoxide.ts	✗
dir.rs	src/db/dir.rs	src/db/dir.rs	✓
tutorial.webp	contrib/tutorial.webp	contrib/tutorial.webp	✓
util.rs	src/util.rs	src/util.rs	✓
powershell.txt	templates/powershell.txt	templates/powershell.txt	✓
config.rs	src/config.rs	src/config.rs	✓
zoxide.1	man/man1/zoxide.1	man/man1/zoxide.1	✓
zoxide-import.1	man/man1/zoxide-import.1	man/man1/zoxide-import.1	✓
tcsh.txt	templates/tcsh.txt	templates/tcsh.txt	✓

jj-vcs/jj — commit 2a41511 (2026-03-15)

File	Guessed path	Actual path	Correct
codespell-additional-dict	.tomlignore/codespell-additional-dict	.config/codespell-additional-dict	✗
redo.rs	lib/src/redo.rs	cli/src/commands/redo.rs	✗
test_signing.rs	lib/tests/test_signing.rs	lib/tests/test_signing.rs	✓
ui.rs	cli/src/ui.rs	cli/src/ui.rs	✓
upload.rs	lib/src/upload.rs	cli/src/commands/gerrit/upload.rs	✗
search.rs	lib/src/search.rs	cli/src/commands/file/search.rs	✗
test_rewrite_transform.rs	cli/tests/test_rewrite_transform.rs	lib/tests/test_rewrite_transform.rs	✗
resolve_conflicts.svg	docs/images/resolve_conflicts.svg	demos/resolve_conflicts.svg	✗
test_parallelize_command.rs	cli/tests/test_parallelize_command.rs	cli/tests/test_parallelize_command.rs	✓
fsmonitor.backend_watchman.toml	cli/testing/fake-diff-editor/.watchmanconfig/fsmonitor.backend_watchman.toml	cli/tests/sample-configs/valid/fsmonitor.backend_watchman.toml	✗

ratatui/ratatui — commit b6dfafd (2026-03-13)

File	Guessed path	Actual path	Correct
email.rs	examples/apps/demo/src/email.rs	examples/apps/demo2/src/tabs/email.rs	✗
reflow.rs	src/widgets/reflow.rs	ratatui-widgets/src/reflow.rs	✗
barchart-grouped.rs	ratatui/examples/barchart-grouped.rs	ratatui-widgets/examples/barchart-grouped.rs	✗
border.rs	ratatui-core/src/border.rs	ratatui-core/src/symbols/border.rs	✗
.rustfmt.toml	.rustfmt.toml	ratatui-macros/.rustfmt.toml	✗
constraints.tape	tests/widgets_block_constraints/constraints.tape	examples/vhs/constraints.tape	✗
polyfills.rs	ratatui-core/src/layout/polyfills.rs	ratatui-widgets/src/polyfills.rs	✗
pixel.rs	ratatui-core/src/style/pixel.rs	ratatui-core/src/symbols/pixel.rs	✗
user-input.tape	examples/apps/inline/src/user-input.tape	examples/vhs/user-input.tape	✗
paragraph.tape	examples/apps/paragraph.tape	ratatui-widgets/examples/vhs/paragraph.tape	✗

sharkdp/bat — commit d9adfe9 (2026-03-14)

File	Guessed path	Actual path	Correct
XML.sublime-syntax.patch	assets/syntaxes/02_Extra/XML.sublime-syntax.patch	assets/patches/XML.sublime-syntax.patch	✗
small-file-29.txt	tests/syntax-tests/source/small-file-29.txt	tests/benchmarks/many-small-files/small-file-29.txt	✗
header.snapshot.txt	assets/syntaxes/header.snapshot.txt	tests/snapshots/output/header.snapshot.txt	✗
issue_314.hs	tests/syntax-tests/source/Haskell/issue_314.hs	tests/examples/regression_tests/issue_314.hs	✗
Apache.sublime-syntax	assets/syntaxes/02_Extra/Apache.sublime-syntax	assets/syntaxes/02_Extra/Apache.sublime-syntax	✓
Assembly (ARM).sublime-syntax	assets/syntaxes/02_Extra/Assembly (ARM).sublime-syntax	assets/syntaxes/02_Extra/Assembly (ARM).sublime-syntax	✓
small-file-73.txt	tests/syntax-tests/source/small-file-73.txt	tests/benchmarks/many-small-files/small-file-73.txt	✗
paging.rs	src/bin/bat/paging.rs	src/paging.rs	✗
Rust.sublime-syntax.patch	assets/syntaxes/02_Extra/Rust.sublime-syntax.patch	assets/patches/Rust.sublime-syntax.patch	✗
changes_grid_header.snapshot.txt	tests/syntax-tests/output/changes_grid_header.snapshot.txt	tests/snapshots/output/changes_grid_header.snapshot.txt	✗

sharkdp/fd — commit db7d448 (2026-03-14)

File	Guessed path	Actual path	Correct
logo.svg	doc/images/logo.svg	doc/logo.svg	✗
output.rs	src/output.rs	src/output.rs	✓
size.rs	src/filesystem/size.rs	src/filter/size.rs	✗
main.rs	src/main.rs	src/main.rs	✓
_fd	contrib/completion/_fd	contrib/completion/_fd	✓
fd.1	doc/fd.1	doc/fd.1	✓
job.rs	src/exec/job.rs	src/exec/job.rs	✓
input.rs	src/exec/input.rs	src/fmt/input.rs	✗
create-deb.sh	contrib/debian/create-deb.sh	scripts/create-deb.sh	✗
hyperlink.rs	src/hyperlink.rs	src/hyperlink.rs	✓

sharkdp/hyperfine — commit 327d5f4 (2026-02-14)

File	Guessed path	Actual path	Correct
asciidoc.rs	src/export/asciidoc.rs	src/export/asciidoc.rs	✓
warp-logo.png	docs/images/warp-logo.png	doc/sponsors/warp-logo.png	✗
randomized_environment_offset.rs	src/randomized_environment_offset.rs	src/util/randomized_environment_offset.rs	✗
unix_timer.rs	src/timer/unix_timer.rs	src/timer/unix_timer.rs	✓
orgmode.rs	src/export/orgmode.rs	src/export/orgmode.rs	✓
progress_bar.rs	src/progress_bar.rs	src/output/progress_bar.rs	✗
execution-order.png	doc/execution-order.png	doc/execution-order.png	✓
common.rs	src/common.rs	tests/common.rs	✗
number.rs	src/number.rs	src/util/number.rs	✗
exit_code.rs	src/benchmark/exit_code.rs	src/util/exit_code.rs	✗

sxyazi/yazi — commit d22c96b (2026-03-15)

File	Guessed path	Actual path	Correct
root.rs	yazi-fm/src/router/root.rs	yazi-fm/src/root.rs	✗
enter.rs	yazi-fm/src/tasks/enter.rs	yazi-actor/src/mgr/enter.rs	✗
tab.lua	yazi-core/src/tab.lua	yazi-plugin/preset/components/tab.lua	✗
watched.rs	yazi-fm/src/watched.rs	yazi-watcher/src/watched.rs	✗
mkdir.rs	yazi-core/src/manager/commands/mkdir.rs	yazi-sftp/src/requests/mkdir.rs	✗
bye.rs	yazi-fm/src/bye.rs	yazi-dds/src/ember/bye.rs	✗
completion_token.rs	yazi-dds/src/completion_token.rs	yazi-shared/src/completion_token.rs	✗
semaphore.rs	yazi-shared/src/event/semaphore.rs	yazi-term/src/semaphore.rs	✗
reporter.rs	yazi-adapter/src/reporter.rs	yazi-watcher/src/reporter.rs	✗
composer.rs	yazi-proxy/src/composer.rs	yazi-binding/src/composer.rs	✗

After cutoff

1jehuang/mermaid-rs-renderer — commit 84e95ab (2026-03-09)

File	Guessed path	Actual path	Correct
breakdown.vl.json	.github/workflows/breakdown.vl.json	docs/benchmarks/breakdown.vl.json	✗
lib.rs	src/lib.rs	src/lib.rs	✓
flowchart_cicd_mmdr.svg	assets/flowchart_cicd_mmdr.svg	docs/comparisons/flowchart_cicd_mmdr.svg	✗
tests__fixtures__class__multiplicity-before.png	tests/fixtures/class/multiplicity-before.png	docs/layout-compare-report/tests__fixtures__class__multiplicity-before.png	✗
ports.mmd	docs/examples/ports.mmd	tests/fixtures/flowchart/ports.mmd	✗
tests__fixtures__gantt__basic-before-raw.png	tests/fixtures/gantt/basic-before-raw.png	docs/layout-compare-report/tests__fixtures__gantt__basic-before-raw.png	✗
flowchart_mmdr.svg	examples/flowchart_mmdr.svg	docs/comparisons/flowchart_mmdr.svg	✗
gitgraph_medium.mmd	test/gitgraph_medium.mmd	benches/fixtures/gitgraph_medium.mmd	✗
sequence_tiny.mmd	test/data/sequence_tiny.mmd	benches/fixtures/sequence_tiny.mmd	✗
benches__fixtures__er_medium-after-raw.png	benches/fixtures/er_medium-after-raw.png	docs/layout-compare-report/benches__fixtures__er_medium-after-raw.png	✗

Veirt/weathr — commit b37221b (2026-03-08)

File	Guessed path	Actual path	Correct
leaves.rs	src/ui/components/leaves.rs	src/animation/leaves.rs	✗
fog.rs	src/weather/fog.rs	src/animation/fog.rs	✗
types.rs	src/types.rs	src/weather/types.rs	✗
units.rs	src/units.rs	src/weather/units.rs	✗
app.rs	src/app.rs	src/app.rs	✓
geolocation.rs	src/geolocation.rs	src/geolocation.rs	✓
open_meteo.rs	src/apis/open_meteo.rs	src/weather/provider/open_meteo.rs	✗
snow.gif	resources/images/snow.gif	docs/snow.gif	✗
airplanes.rs	src/data/airplanes.rs	src/animation/airplanes.rs	✗
raindrops.rs	src/audio/raindrops.rs	src/animation/raindrops.rs	✗

bgreenwell/xleak — commit a07bd4c (2025-12-06)

File	Guessed path	Actual path	Correct
main.rs	src/main.rs	src/main.rs	✓
tui.rs	src/tui.rs	src/tui.rs	✓
display.rs	src/display.rs	src/display.rs	✓
README.md	vignettes/README.md	tests/fixtures/README.md	✗
generate_test_tables.py	tests/testthat/generate_test_tables.py	tests/fixtures/generate_test_tables.py	✗
generate_all_tests.py	tests/generate_all_tests.py	tests/fixtures/generate_all_tests.py	✗
main.wxs	tests/testthat/xleak/main.wxs	wix/main.wxs	✗
demo.tape	vignettes/demo.tape	assets/demo.tape	✗
generate_test_large.py	inst/testfiles/generate_test_large.py	tests/fixtures/generate_test_large.py	✗
demo.gif	inst/images/demo.gif	assets/demo.gif	✗

buyukakyuz/install-nothing — commit f8cdcde (2025-12-20)

File	Guessed path	Actual path	Correct
initramfs.rs	src/initramfs.rs	src/stages/initramfs.rs	✗
container.rs	src/container.rs	src/stages/container.rs	✗
bios.rs	src/boot/bios.rs	src/stages/bios.rs	✗
ai.rs	src/ai.rs	src/stages/ai.rs	✗
packages.rs	src/packages.rs	src/stages/packages.rs	✗
main.rs	src/main.rs	src/main.rs	✓
kernel.rs	src/kernel.rs	src/stages/kernel.rs	✗
log_generator.rs	src/log_generator.rs	src/log_generator.rs	✓
xorg.rs	src/xorg.rs	src/stages/xorg.rs	✗
installer.rs	src/installer.rs	src/installer.rs	✓

googleworkspace/cli — commit 1308786 (2026-03-13)

File	Guessed path	Actual path	Correct
vhs.md	docs/vhs.md	.agent/skills/vhs.md	✗
scene6.txt	docs/story/scene6.txt	art/scene6.txt	✗
config.json	src/tools/shared/config.json	.changeset/config.json	✗
error.rs	src/error.rs	src/error.rs	✓
recipes.yaml	config/recipes/recipes.yaml	registry/recipes.yaml	✗
script.rs	src/script.rs	src/helpers/script.rs	✗
generate_skills.rs	tools/generate_skills.rs	src/generate_skills.rs	✗
sheets.rs	src/sheets.rs	src/helpers/sheets.rs	✗
validate.rs	src/commands/validate.rs	src/validate.rs	✗
forward.rs	src/commands/gmail/forward.rs	src/helpers/gmail/forward.rs	✗

njbrake/agent-of-empires — commit 778eb8b (2026-03-13)

File	Guessed path	Actual path	Correct
Step.astro	src/components/Step.astro	website/src/components/Step.astro	✗
v004_unified_environment.rs	src/unified_environment/v004_unified_environment.rs	src/migrations/v004_unified_environment.rs	✗
groups.rs	src/strategy/groups.rs	src/session/groups.rs	✗
main.js	src/main.js	website/public/main.js	✗
status_poller.rs	src/status_poller.rs	src/tui/status_poller.rs	✗
config.rs	src/config.rs	src/session/config.rs	✗
social-preview.png	public/social-preview.png	assets/social-preview.png	✗
config-schema.md	docs/config-schema.md	specs/002-hooks-settings-tui/contracts/config-schema.md	✗
tui_launch.rs	src/bin/tui_launch.rs	tests/e2e/tui_launch.rs	✗
constitution.md	docs/constitution.md	.specify/memory/constitution.md	✗

rtk-ai/rtk — commit 188ec99 (2026-03-12)

File	Guessed path	Actual path	Correct
gcc.toml	.github/workflows/gcc.toml	src/filters/gcc.toml	✗
filter.rs	src/filter.rs	src/filter.rs	✓
rewrite_cmd.rs	src/commands/rewrite_cmd.rs	src/rewrite_cmd.rs	✗
ssh.toml	config/ssh.toml	src/filters/ssh.toml	✗
gcloud.toml	deploy/config/gcloud.toml	src/filters/gcloud.toml	✗
read.rs	crates/rtk-schema/src/read.rs	src/read.rs	✗
ccusage.rs	src/ccusage.rs	src/ccusage.rs	✓
basedpyright.toml	No such file in a subfolder.	src/filters/basedpyright.toml	✗
repo-recap.md	docs/repo-recap.md	.claude/skills/repo-recap.md	✗
yamllint.toml	.github/yamllint.toml	src/filters/yamllint.toml	✗

sheeki03/tirith — commit 352d861 (2026-03-12)

File	Guessed path	Actual path	Correct
tirith.toml	config/tirith.toml	packaging/mise/tirith.toml	✗
shell_weirdness.toml	tests/fixtures/shell_weirdness.toml	tests/fixtures/shell_weirdness.toml	✓
fetch.rs	src/peer/fetch.rs	crates/tirith/src/cli/fetch.rs	✗
init.rs	src/init.rs	crates/tirith/src/cli/init.rs	✗
mcp_server.rs	src/mcp_server.rs	crates/tirith/src/cli/mcp_server.rs	✗
confusables.rs	src/unicode/confusables.rs	crates/tirith-core/src/confusables.rs	✗
configfile.toml	config/configfile.toml	tests/fixtures/configfile.toml	✗
run.rs	src/instance/run.rs	crates/tirith/src/cli/run.rs	✗
redact.rs	src/redact.rs	crates/tirith-core/src/redact.rs	✗
url_validate.rs	crates/common/src/url_validate.rs	crates/tirith-core/src/url_validate.rs	✗

unhappychoice/gitlogue — commit 4477ef2 (2026-03-13)

File	Guessed path	Actual path	Correct
nord.rs	src/colors/nord.rs	src/theme/themes/nord.rs	✗
yaml.rs	src/format/yaml.rs	src/syntax/languages/yaml.rs	✗
demo.mp4	demo/demo.mp4	docs/assets/demo.mp4	✗
themes.md	docs/themes.md	docs/themes.md	✓
screenshot-editor.png	docs/assets/screenshot-editor.png	docs/assets/screenshot-editor.png	✓
terminal.rs	src/ui/terminal.rs	src/panes/terminal.rs	✗
kotlin.rs	src/parsing/parsers/kotlin.rs	src/syntax/languages/kotlin.rs	✗
swift_highlights.scm	queries/swift_highlights.scm	src/syntax/languages/queries/swift_highlights.scm	✗
specification.md	docs/specification.md	docs/specification.md	✓
dart_highlights.scm	queries/highlights/dart_highlights.scm	src/syntax/languages/queries/dart_highlights.scm	✗

Appendix B — Class/Method Name Test: Raw Results

Before cutoff

BurntSushi/ripgrep

File	Guess	Verdict
crates/searcher/src/searcher/mod.rs	Searcher	PASS
crates/cli/src/wtr.rs	Printer	FAIL
crates/regex/src/ast.rs	AstPrinter	FAIL
crates/grep/examples/simplegrep.rs	run	PASS
tests/multiline.rs	f109_braces_fail_no_exponential_blowup_lf	FAIL

Orange-OpenSource/hurl

File	Guess	Verdict
integration/hurl/tests_ok/hello/hello_gb2312.py	HelloGB2312Handler	FAIL
integration/hurl/tests_failed/runner_errors/runner_errors.py	runner_errors	FAIL
bin/spec/options/generate_completion.py	parser_completion	PASS
packages/hurl_core/src/types.rs	StatKeyValuePair	FAIL
packages/hurl/src/runner/filter/utf8_encode.rs	utf8_encode	PASS

Rigellute/spotify-tui

File	Guess	Verdict
src/handlers/mod.rs	update_playback_progress	FAIL
src/handlers/input.rs	new_key_event	FAIL
src/handlers/track_table.rs	TrackTableState	FAIL
src/user_config.rs	UserConfig	PASS
src/config.rs	ConfigValues	FAIL

ajeetdsouza/zoxide

File	Guess	Verdict
src/db/stream.rs	StreamDb	FAIL
src/cmd/edit.rs	edit	PASS
src/config.rs	config_dir	FAIL
src/shell.rs	Shell	FAIL
src/db/mod.rs	open	PASS

jj-vcs/jj

File	Guess	Verdict
cli/src/commands/git/colocation.rs	cmd_git_colocate	PASS
cli/examples/custom-working-copy/main.rs	create_custom_working_copy	FAIL
cli/src/commands/operation/log.rs	cmd_operation_log	FAIL
lib/testutils/src/lib.rs	new_temp_dir	PASS
lib/tests/test_id_prefix.rs	test_id_prefix_context_commits_synced_中文字幕	FAIL

ratatui/ratatui

File	Guess	Verdict
ratatui-core/src/terminal.rs	Terminal	PASS
examples/apps/demo2/src/colors.rs	tailwind_palette	FAIL
ratatui-termwiz/src/lib.rs	RatatuiTerminal	FAIL
examples/apps/constraints/src/main.rs	flex	FAIL
ratatui-widgets/examples/canvas.rs	MapResolution	PASS

sharkdp/bat

File	Guess	Verdict
src/bin/bat/assets.rs	build_assets	PASS
build/application.rs	App	FAIL
src/assets/build_assets.rs	build_assets	PASS
tests/syntax-tests/highlighted/Cpp/test.cpp	SubstitutePiece	FAIL
src/syntax_mapping/ignored_suffixes.rs	ignored_suffix_for_path	FAIL

sharkdp/fd

File	Guess	Verdict
tests/testenv/mod.rs	dir_inside_tmp	FAIL
src/cli.rs	parse_pattern	FAIL
src/regex_helper.rs	pattern_has_uppercase_char】【。json	PASS
src/filter/size.rs	SizeFilter	PASS
src/exec/job.rs	JobQueue	FAIL

sharkdp/hyperfine

File	Guess	Verdict
src/options.rs	build_hyperfine_options_parser	FAIL
src/export/asciidoc.rs	asciidoc_table	FAIL
src/output/format.rs	BenchmarkResultWithRelativeSpeed	FAIL
src/output/progress_bar.rs	ProgressBar	PASS
src/outlier_detection.rs	modified_zscores	PASS

sxyazi/yazi

File	Guess	Verdict
yazi-ffi/src/cf_dict.rs	CFDict	PASS
yazi-proxy/src/which.rs	which	PASS
yazi-dds/src/ember/duplicate.rs	purge_duplicates	FAIL
yazi-scheduler/src/ongoing.rs	Ongoing	PASS
yazi-shim/src/twox.rs	XxHash3_64	FAIL

After cutoff

1jehuang/mermaid-rs-renderer

File	Guess	Verdict
src/layout/radar.rs	RadarLayout	FAIL
src/layout/ranking.rs	crossing_reduction	FAIL
src/cli.rs	parse_args	FAIL
src/layout/error.rs	LayoutError	FAIL
scripts/layout_score.py	rank_layouts	FAIL

Veirt/weathr

File	Guess	Verdict
src/render/capabilities.rs	PointQueryExt	FAIL
src/cli.rs	Options	FAIL
src/app.rs	render_weather	FAIL
src/animation/mod.rs	Animation	PASS
src/error.rs	WeatherError	PASS

bgreenwell/xleak

File	Guess	Verdict
src/config.rs	Config	PASS
src/workbook.rs	Workbook	PASS
src/main.rs	main	FAIL
tests/fixtures/generate_test_large.py	generate_test_large	FAIL
tests/fixtures/generate_test_tables.py	generate_test_tables	FAIL

buyukakyuz/install-nothing

File	Guess	Verdict
src/log_generator.rs	LogGenerator	PASS
src/stages/system.rs	stage_system	FAIL
src/stages/bootloader.rs	get_disks	FAIL
src/ui/spinner.rs	Spinner	PASS
src/stages/mod.rs	initialize	FAIL

googleworkspace/cli

File	Guess	Verdict
src/helpers/mod.rs	UserInfo	FAIL
src/error.rs	Error	FAIL
src/helpers/events/renew.rs	renew_event	FAIL
src/helpers/gmail/watch.rs	parse_watch_response	FAIL
src/text.rs	ellipsisize	FAIL

njbrake/agent-of-empires

File	Guess	Verdict
tests/e2e/tui_launch.rs	spawn_tui	PASS
tests/diff_integration.rs	test_diff_compose	FAIL
src/cli/status.rs	StatusCommand	FAIL
src/session/storage.rs	MemoryStorage	FAIL
src/tui/creation_poller.rs	CreationPoller	PASS

rtk-ai/rtk

File	Guess	Verdict
src/cargo_cmd.rs	cargo_cmd	PASS
src/session_cmd.rs	SessionCmd	PASS
src/dotnet_cmd.rs	dotnet_help	FAIL
src/learn/mod.rs	LearnStatus	FAIL
src/discover/mod.rs	discover	FAIL

sheeki03/tirith

File	Guess	Verdict
crates/tirith-core/src/mcp/resources.rs	ProjectRoot	FAIL
tools/license-server/src/main.rs	run	FAIL
tools/license-server/src/routes/mod.rs	health_check	FAIL
crates/tirith/src/cli/setup/fs_helpers.rs	create_setup_dirs	FAIL
crates/tirith-core/src/rules/terminal.rs	TerminalRule	FAIL

unhappychoice/gitlogue

File	Guess	Verdict
src/animation.rs	parse_animation_keyframes	FAIL
src/ui.rs	try_get_color	FAIL
src/widgets/selectable_paragraph.rs	SelectableParagraph	PASS
src/main.rs	expand_history_line_randomly	FAIL
src/panes/status_bar.rs	StatusBarPane	PASS
