LaTeX Reference Checker: Bash Script to Find Unused Labels and Missing Citations
A robust bash script that audits LaTeX documents to find unreferenced figures, tables, equations, and citation issues while ignoring commented code.
When preparing research manuscripts, it is common to accumulate unused labels, orphaned citations, and commented-out content. This bash script provides a comprehensive audit of your LaTeX document, identifying unreferenced figures, tables, equations, and sections, unused bibliography entries, and missing citations – all while properly ignoring commented lines. The script preserves the sequence in which labels appear in your document, making it easier to locate and manage them.
The Problem
During manuscript development, you might encounter:
- Unreferenced labels: \label{fig:old_analysis} defined but never cited; clutters the document and confuses reviewers.
- Commented labels counted: % \label{tab:removed} still detected by naive checks; produces false positives.
- Missing citations: \cite{smith2023} used in the text but absent from the .bib file; produces undefined-citation warnings and "?" markers in the compiled output.
- Unused bibliography entries: references added but never cited; inflates the reference count.
Manual checking across 50+ pages with multiple revisions becomes impractical.
The Solution
Features
- Comment-aware: Ignores both full-line and inline comments
- Sequence-preserving: Shows labels in document order
- Comprehensive: Checks figures, tables, equations, sections, and citations
- All major reference commands: Supports \ref, \autoref, \Autoref, \cref, \Cref, \vref, \pageref, \nameref, \labelcref, \eqref
- Multi-key refs: Handles \cref{fig:a,fig:b} style calls (cleveref)
- biblatex support: Detects \addbibresource{} in addition to \bibliography{}
- Broad citation coverage: Covers natbib (\cite, \citep, \citet, \citealt, \citealp, \citeauthor, \citeyear) and biblatex (\parencite, \textcite, \footcite, \autocite) including starred variants
- Lightweight: Bash plus the standard grep, sed, and awk utilities; the grep -P patterns shown below require GNU grep
- Fast: Processes typical manuscripts in less than one second
Installation and Usage
Follow these three simple steps to start using the reference checker:
- Download the script: download check_latex_refs.sh (link below) and save it in your LaTeX project directory
- Make it executable: run chmod +x check_latex_refs.sh in your terminal
- Run the checker: execute ./check_latex_refs.sh manuscript.tex (replace with your filename)
Step 1: Download the Script
Save the file as check_latex_refs.sh in the same directory as your LaTeX file.
Step 2: Make it Executable
Open a Bash-compatible terminal (on Windows, use WSL or Git Bash rather than CMD or PowerShell, which do not provide chmod), navigate to your project directory, and run:
chmod +x check_latex_refs.sh
This command gives the script permission to run on your system.
Step 3: Run the Checker
Execute the script with your LaTeX file:
./check_latex_refs.sh manuscript.tex
Replace manuscript.tex with your actual LaTeX filename. The script will analyze your document and display a comprehensive report of all references.
Example Output
Here is what the script reports for a typical manuscript:
Checking references in: paper.tex
========================================
=== TABLES ===
Total labels: 11
Total refs: 8
Unreferenced tables:
tab:appendix_data
tab:extra_results
tab:summary_stats
=== FIGURES ===
Total labels: 6
Total refs: 6
Unreferenced figures:
=== EQUATIONS ===
Total labels: 15
Total refs: 12
Unreferenced equations:
eq:supplementary
eq:variance
eq:alt_form
=== SECTIONS ===
Total labels: 4
Total refs: 3
Unreferenced sections:
sec:future_work
=== CITATIONS ===
Using bibliography file: references.bib
Total bib entries: 45
Unique cited keys: 38
Uncited bibliography entries:
smith2020old
jones2019unused
brown2018extra
Missing bibliography entries (cited but not in .bib):
nguyen2023missing
========================================
Check complete!
The script detects references made with \ref{fig:one}, \autoref{tab:results}, \cref{eq:main}, \eqref{eq:energy}, or even \cref{fig:a,fig:b} – all are counted correctly.
Technical Deep Dive
How It Works
1. Comment Removal
The script uses a two-step approach to handle comments:
# Step 1: Remove lines starting with %
# Step 2: Remove inline comments but preserve \%
TEXCONTENT=$(grep -v '^\s*%' "$TEXFILE" | sed 's/\([^\\]\)%.*$/\1/')
The order matters. Full-line comments must be removed first, then inline comments.
Test cases:
| LaTeX Code | Processed As |
|---|---|
| \label{fig:test} | Included |
| % \label{fig:old} | Excluded |
| Text \ref{fig:a} % comment | \ref{fig:a} included |
| 50\% efficiency | \% preserved |
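The test cases above can be reproduced with a small, self-contained sketch. The sample lines and the temporary file are illustrative, and `[[:space:]]` is used in place of `\s` as a portability assumption; the sed expression is the one shown earlier.

```shell
#!/usr/bin/env bash
# Sketch only: the sample lines and the temporary file are illustrative.
TEXFILE=$(mktemp)
cat > "$TEXFILE" <<'EOF'
\label{fig:test}
% \label{fig:old}
Text \ref{fig:a} % comment
50\% efficiency
EOF

# Step 1 drops full-line comments; step 2 trims inline comments but keeps \%.
TEXCONTENT=$(grep -v '^[[:space:]]*%' "$TEXFILE" | sed 's/\([^\\]\)%.*$/\1/')
rm -f "$TEXFILE"

printf '%s\n' "$TEXCONTENT"
```

One edge case worth knowing: a comment that directly follows a line break, as in `\\% note`, is kept, because the sed pattern treats any `%` preceded by a backslash as escaped.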
2. Sequence Preservation
The script uses awk '!seen[$0]++' to remove duplicates while maintaining order:
# Traditional approach (loses order)
sort -u
# Our approach (preserves order)
awk '!seen[$0]++'
Why this matters:
If your document has:
\section{Methods} % Line 50
\label{fig:workflow} % Line 65
\section{Results} % Line 150
\label{fig:accuracy} % Line 170
\section{Appendix} % Line 300
\label{fig:extra} % Line 310 (unreferenced)
The output shows fig:extra in document order (not alphabetically sorted), making it easier to locate.
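To make the difference concrete, here is a minimal sketch that runs both deduplication approaches on the same input. The label names are made-up sample data.

```shell
#!/usr/bin/env bash
# Sketch only: the label names below are made-up sample data.
labels=$(printf '%s\n' workflow accuracy workflow extra accuracy)

# awk prints a line only the first time it is seen, so document order survives.
ordered=$(printf '%s\n' "$labels" | awk '!seen[$0]++')

# sort -u also removes duplicates but reorders the lines alphabetically.
sorted=$(printf '%s\n' "$labels" | sort -u)

printf 'document order: %s\n' "$(printf '%s' "$ordered" | tr '\n' ' ')"
printf 'sorted order:   %s\n' "$(printf '%s' "$sorted" | tr '\n' ' ')"
```

The awk idiom relies on `seen[$0]++` evaluating to 0 (false) on the first occurrence of a line and to a positive count afterwards, so the negation is true exactly once per distinct line.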
3. Unified Reference Pattern
All reference commands are captured in one pass using a single regex, then split by prefix:
# Multi-key calls such as \cref{fig:a,fig:b} are split on commas by tr.
# (A comment placed after a line-continuation backslash would break the pipeline.)
ALL_REFS=$(echo "$TEXCONTENT" \
  | grep -oP '\\(auto|Auto|c|C|v|page|name|labelc|eq)?ref\{[^}]+\}' \
  | grep -oP '\{[^}]+\}' \
  | tr -d '{}' \
  | tr ',' '\n' \
  | sed 's/^[[:space:]]*//' \
  | grep -v '^$' \
  | awk '!seen[$0]++')
References are then filtered by label prefix using a helper:
refs_for_prefix() {
  local prefix="$1"
  echo "$ALL_REFS" | grep "^${prefix}:" | sed "s/^${prefix}://" | awk '!seen[$0]++'
}
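The following sketch runs the extraction pipeline and the prefix helper on two sample sentences. It is an illustration rather than the script itself: the sample text is invented, `printf` replaces `echo`, and `grep -oE` stands in for `grep -oP` as a portability assumption.

```shell
#!/usr/bin/env bash
# Sketch only: TEXCONTENT holds made-up sample text, and grep -oE is used
# here in place of the article's grep -oP (an assumption for portability).
TEXCONTENT='See \cref{fig:workflow, fig:accuracy} and \autoref{tab:results}.
Equation \eqref{eq:main} follows from \ref{fig:workflow}.'

ALL_REFS=$(printf '%s\n' "$TEXCONTENT" \
  | grep -oE '\\(auto|Auto|c|C|v|page|name|labelc|eq)?ref\{[^}]+\}' \
  | grep -oE '\{[^}]+\}' \
  | tr -d '{}' \
  | tr ',' '\n' \
  | sed 's/^[[:space:]]*//' \
  | grep -v '^$' \
  | awk '!seen[$0]++')

# Keep only the references with the given prefix and strip the prefix itself.
refs_for_prefix() {
  printf '%s\n' "$ALL_REFS" | grep "^$1:" | sed "s/^$1://"
}

FIG_REFS=$(refs_for_prefix fig)
printf '%s\n' "$FIG_REFS"
```

Here `fig:workflow` is referenced twice but appears once in `ALL_REFS`, and the space after the comma in the multi-key call is trimmed before the prefix filter runs.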
Supported reference commands:
| Command | Package | Notes |
|---|---|---|
| \ref{...} | Standard LaTeX | Basic cross-reference |
| \autoref{...} | hyperref | Adds type name automatically |
| \Autoref{...} | hyperref | Sentence-initial capitalised form |
| \cref{...} | cleveref | Smart reference with type |
| \Cref{...} | cleveref | Capitalised form |
| \vref{...} | varioref | Adds page information |
| \pageref{...} | Standard LaTeX | Page number only |
| \nameref{...} | hyperref | Uses section name |
| \labelcref{...} | cleveref | Label-only reference |
| \eqref{...} | amsmath | Equation reference with parentheses |
Multi-key cleveref calls are handled automatically:
% All three labels are detected as referenced:
\cref{fig:workflow,fig:accuracy,tab:results}
Examples:
Labels detected:
- \label{tab:results} extracts results
- \label{fig:analysis_2023} extracts analysis_2023
- \label{eq:main_theorem} extracts main_theorem
- \label{sec:introduction} extracts introduction
References detected:
- \ref{fig:diagram} → matches diagram
- \autoref{tab:results} → matches results
- \Cref{fig:workflow} → matches workflow
- \eqref{eq:einstein} → matches einstein
- \cref{fig:a,fig:b} → matches both a and b
Not matched (correctly excluded):
- % \label{tab:old} (filtered by comment removal)
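Putting label detection and reference detection together, a check for one prefix might look like the sketch below. The sample content and the variable names are illustrative assumptions, and only single-key references are handled to keep it short.

```shell
#!/usr/bin/env bash
# Sketch only: sample content and variable names are illustrative assumptions.
TEXCONTENT='\label{fig:workflow}
\label{fig:extra}
\label{fig:accuracy}
As \ref{fig:workflow} and \cref{fig:accuracy} show.'

# Labels with the fig: prefix, in document order, prefix stripped.
FIG_LABELS=$(printf '%s\n' "$TEXCONTENT" \
  | grep -oE '\\label\{fig:[^}]+\}' \
  | sed 's/^\\label{fig://; s/}$//' \
  | awk '!seen[$0]++')

# Referenced fig: keys (single-key calls only, to keep the sketch short).
FIG_REFS=$(printf '%s\n' "$TEXCONTENT" \
  | grep -oE '\\(auto|Auto|c|C|v|page|name|labelc|eq)?ref\{fig:[^}]+\}' \
  | sed 's/^.*{fig://; s/}$//')

# A label is unreferenced when no reference matches it exactly (-x) and literally (-F).
UNREFERENCED=$(printf '%s\n' "$FIG_LABELS" | grep -vxF -e "$FIG_REFS")
printf 'Unreferenced figures: %s\n' "$UNREFERENCED"
```

Passing the newline-separated reference list to a single `-e` makes grep treat each line as its own pattern, which avoids a shell loop; the `-F` flag keeps characters such as `.` in label names from being read as regex operators.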
4. Citation Coverage
The citation section captures all common natbib and biblatex commands, including multi-key calls:
NATBIB='\\cite(p|t|alt|alp|author|year|yearpar|num|s)?\*?\{[^}]+\}'
BIBLATEX='\\(parencite|textcite|footcite|autocite)\*?\{[^}]+\}'
CITATIONS=$(echo "$TEXCONTENT" \
  | grep -oP "$NATBIB|$BIBLATEX" \
  | grep -oP '\{[^}]+\}' \
  | tr -d '{}' \
  | tr ',' '\n' \
  | ...)
Supported citation commands:
| Command | Package |
|---|---|
| \cite, \citep, \citet | natbib |
| \citealt, \citealp | natbib |
| \citeauthor, \citeyear, \citeyearpar | natbib |
| \citenum, \cites | natbib / misc |
| \parencite, \textcite | biblatex |
| \footcite, \autocite | biblatex |
| Starred variants (\citep* etc.) | natbib |
Bibliography detection supports both:
- \bibliography{references} (natbib / BibTeX)
- \addbibresource{references.bib} (biblatex)
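As a rough illustration of how cited keys can be compared against a .bib file, consider the sketch below. The sample text, the sample bibliography, the simplified `cite[a-z]*` pattern, and the variable names CITED, BIBKEYS, UNCITED, and MISSING are all assumptions made for this example; the script's own implementation may differ.

```shell
#!/usr/bin/env bash
# Sketch only: the sample text, the sample .bib content, and the variable
# names CITED, BIBKEYS, UNCITED, and MISSING are illustrative assumptions.
TEXCONTENT='Prior work \citep{smith2020, jones2019} and \textcite{lee2021}.'
BIBFILE=$(mktemp)
cat > "$BIBFILE" <<'EOF'
@article{smith2020,
  title = {A},
}
@book{jones2019,
  title = {B},
}
@misc{brown2018,
  title = {C},
}
EOF

# Cited keys: match the command, keep the braces, split multi-key calls.
CITED=$(printf '%s\n' "$TEXCONTENT" \
  | grep -oE '\\(cite[a-z]*|parencite|textcite|footcite|autocite)\*?\{[^}]+\}' \
  | grep -oE '\{[^}]+\}' \
  | tr -d '{}' | tr ',' '\n' \
  | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
  | grep -v '^$' | awk '!seen[$0]++')

# Entry keys: the text between "@type{" and the first comma.
BIBKEYS=$(grep -oE '^@[A-Za-z]+\{[^,]+' "$BIBFILE" | sed 's/^@[A-Za-z]*{//')
rm -f "$BIBFILE"

UNCITED=$(printf '%s\n' "$BIBKEYS" | grep -vxF -e "$CITED")
MISSING=$(printf '%s\n' "$CITED" | grep -vxF -e "$BIBKEYS")
printf 'Uncited: %s\nMissing: %s\n' "$UNCITED" "$MISSING"
```

Note that this key extraction would also pick up `@string`, `@preamble`, and `@comment` blocks, so a real bibliography may need those filtered out first.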
Advanced Usage
1. Check Multiple Files
Create check_all.sh:
#!/bin/bash
for file in *.tex; do
  echo "File: $file"
  ./check_latex_refs.sh "$file"
  echo ""
done
2. Generate Timestamped Reports
# Create dated log
./check_latex_refs.sh paper.tex > "audit_$(date +%Y%m%d_%H%M%S).log"
# Compare before/after revision
./check_latex_refs.sh paper.tex > before_revision.log
# ... make changes ...
./check_latex_refs.sh paper.tex > after_revision.log
diff before_revision.log after_revision.log
3. Git Pre-commit Hook
Download latex-pre-commit-hook.sh, copy it to .git/hooks/pre-commit, then run chmod +x .git/hooks/pre-commit.
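For readers who want to see the general shape of such a hook, here is a hypothetical sketch. It is not the downloadable latex-pre-commit-hook.sh: the function names are invented, and it assumes that check_latex_refs.sh sits in the repository root and prints the "Missing bibliography entries" block in the layout shown under Example Output.

```shell
#!/usr/bin/env bash
# Hypothetical pre-commit hook sketch; this is NOT the downloadable
# latex-pre-commit-hook.sh. It assumes check_latex_refs.sh sits in the
# repository root and that staged file names contain no spaces.

# Reads a report on stdin; succeeds when at least one key is listed
# under the "Missing bibliography entries" header.
has_missing_citations() {
  awk '
    /Missing bibliography entries/               { in_block = 1; next }
    in_block && (/^[[:space:]]*=/ || NF == 0)    { in_block = 0 }
    in_block                                     { found = 1 }
    END { exit (found ? 0 : 1) }
  '
}

main() {
  status=0
  # Staged .tex files that were added, copied, or modified.
  for file in $(git diff --cached --name-only --diff-filter=ACM -- '*.tex'); do
    if ./check_latex_refs.sh "$file" | has_missing_citations; then
      echo "pre-commit: $file cites keys that are missing from the .bib file" >&2
      status=1
    fi
  done
  exit "$status"
}

# Run only when installed as .git/hooks/pre-commit, so sourcing stays side-effect free.
case "${0##*/}" in
  pre-commit) main ;;
esac
```

Blocking the commit only on missing citations, and not on unreferenced labels, is a design choice in this sketch: unused labels are often harmless during drafting, while a missing key leaves a visible gap in the compiled document.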
4. Makefile Integration
Download the Makefile and save it as Makefile in your project directory.
Usage:
make # Compile and check
make check # Run reference check only
make audit # Generate timestamped audit
5. Pre-submission Checklist Script
Download pre-submit.sh and run ./pre-submit.sh from your project directory.
Extensions and Customizations
Add Line Numbers
Modify the unreferenced item display to show line numbers:
while IFS= read -r label; do
  if ! echo "$TAB_REFS" | grep -qxF "$label"; then
    # -F matches the label literally; head keeps only the first occurrence
    LINE=$(grep -nF "\\label{tab:$label}" "$TEXFILE" | head -n 1 | cut -d: -f1)
    echo " tab:$label (line $LINE)"
  fi
done <<< "$TAB_LABELS_UNIQ"
Output:
Unreferenced tables:
tab:appendix_data (line 145)
tab:extra_results (line 203)
Color Output
Add ANSI color codes for better visibility:
# Add at top of script (after shebang)
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Use in output
echo -e "${BLUE}=== TABLES ===${NC}"
echo "Total labels: $TAB_COUNT"
echo "Total refs: $TABREF_COUNT"
if [ "$TAB_COUNT" -eq "$TABREF_COUNT" ]; then
  echo -e "${GREEN}All tables referenced${NC}"
else
  echo -e "${RED}Unreferenced tables:${NC}"
  while IFS= read -r label; do
    if ! echo "$TAB_REFS" | grep -qx "$label"; then
      echo -e " ${YELLOW}tab:$label${NC}"
    fi
  done <<< "$TAB_LABELS_UNIQ"
fi
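When the report is redirected to a log file, as in the timestamped-report examples, raw ANSI codes end up in the file. One possible way to avoid that is to enable colors only when stdout is a terminal. The helper name setup_colors is hypothetical and not part of the script.

```shell
#!/usr/bin/env bash
# Sketch only: setup_colors is a hypothetical helper, not part of the script.
setup_colors() {
  if [ -t 1 ]; then
    # stdout is a terminal: enable ANSI colors.
    RED='\033[0;31m'; GREEN='\033[0;32m'; YELLOW='\033[1;33m'
    BLUE='\033[0;34m'; NC='\033[0m'
  else
    # stdout is a file or pipe: leave the variables empty.
    RED=''; GREEN=''; YELLOW=''; BLUE=''; NC=''
  fi
}

setup_colors
# %b expands the backslash escapes stored in the color variables.
printf '%b\n' "${BLUE}=== TABLES ===${NC}"
```

With this in place, `./check_latex_refs.sh paper.tex > audit.log` would produce a plain-text log, while an interactive run keeps the colored headings.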
Export to CSV
./latex-refs-to-csv.sh manuscript.tex
JSON Output
Download latex-refs-to-json.sh
./latex-refs-to-json.sh manuscript.tex
Workflow Integration Examples
1. Continuous Integration (CI)
Download the workflow file and place it at .github/workflows/latex-check.yml in your repository.
2. VS Code Task
Download the task file and place it at .vscode/tasks.json in your project.
3. Overleaf/Git Sync
Download sync-and-check.sh and run ./sync-and-check.sh from your project directory.
Comparison with Alternatives
- This script
  - Fast, comment-aware, preserves label order, broad ref/cite command support
  - ⚠ Single file only
  - Best for: quick audits, CI/CD pipelines
- refcheck package
  - LaTeX-integrated; adds visual markers in the PDF margin
  - ⚠ Requires recompilation; clutters the PDF output
  - Best for: checking during active writing
- chktex
  - Comprehensive linting with many rule-based checks
  - ⚠ Verbose output; steep learning curve
  - Best for: deep style and syntax analysis
- VS Code LaTeX Workshop
  - Real-time feedback integrated directly in the editor
  - ⚠ Editor-specific; no batch or CI support
  - Best for: active editing sessions
- Python scripts
  - Highly flexible; can produce HTML reports and custom output
  - ⚠ Slower; requires a Python environment
  - Best for: custom or complex workflows
Real-World Case Study
Initial State
Manuscript for journal submission with 8 months of revisions:
=== TABLES ===
Total labels: 8
Total refs: 7
Unreferenced tables:
tab:correlation_matrix
=== FIGURES ===
Total labels: 12
Total refs: 12
=== EQUATIONS ===
Total labels: 6
Total refs: 6
=== CITATIONS ===
Total bib entries: 67
Uncited bibliography entries:
lecun2015deep
goodfellow2016deep
chollet2017deep
bishop2006pattern
murphy2012machine
hastie2009elements
james2013introduction
vapnik1995nature
cover2006elements
Actions Taken
- Moved tab:correlation_matrix to supplementary materials
- Removed 9 unused deep learning references (leftover from earlier drafts)
- Cleaned up bibliography from 67 to 58 entries
- Verified all remaining citations were relevant
Final State
=== TABLES ===
Total labels: 7
Total refs: 7
Unreferenced tables:
=== CITATIONS ===
Total bib entries: 58
Uncited bibliography entries:
Time saved: Approximately 30 minutes of manual checking
Outcome: Clean submission, no reviewer comments about references
Limitations and Workarounds
Current Limitations
- Single file only: Does not traverse \input{} or \include{} commands
- Standard prefixes: Assumes tab:, fig:, eq:, sec: conventions
- Citation commands: Does not cover every possible custom cite command
Workarounds
For multi-file projects:
# Combine files first
cat main.tex chapter*.tex appendix.tex > combined.tex
./check_latex_refs.sh combined.tex
rm combined.tex
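If the chapter files do not follow a predictable naming pattern, the list of files can instead be read from the main document. The sketch below is one possible approach under stated assumptions: list_inputs is a hypothetical helper, it handles only one level of \input{} and \include{}, and it ignores nested includes and files in subdirectories.

```shell
#!/usr/bin/env bash
# Sketch only: list_inputs is a hypothetical helper, and the files below are
# throwaway samples. Nested includes and subdirectory paths are not handled.
list_inputs() {
  # Print the files named by uncommented \input{...} or \include{...} lines
  # in "$1", appending .tex when the extension is missing.
  grep -v '^[[:space:]]*%' "$1" \
    | grep -oE '\\(input|include)\{[^}]+\}' \
    | sed -E 's/^\\(input|include)\{//; s/\}$//' \
    | while IFS= read -r name; do
        case "$name" in
          *.tex) printf '%s\n' "$name" ;;
          *)     printf '%s.tex\n' "$name" ;;
        esac
      done
}

WORKDIR=$(mktemp -d)
printf '%s\n' '\input{intro}' '% \input{old}' '\include{results}' > "$WORKDIR/main.tex"
printf '%s\n' '\label{sec:intro}' > "$WORKDIR/intro.tex"
printf '%s\n' '\label{tab:results}' > "$WORKDIR/results.tex"

INPUTS=$(list_inputs "$WORKDIR/main.tex")
# Concatenate the main file and its inputs, as in the workaround above.
( cd "$WORKDIR" && cat main.tex $INPUTS > combined.tex )
COMBINED_LINES=$(grep -c '' "$WORKDIR/combined.tex")
rm -rf "$WORKDIR"

printf '%s\n' "$INPUTS"
```

Because the pattern requires an opening brace directly after `input` or `include`, commands such as `\includegraphics{...}` are not mistaken for file inclusions.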
For custom prefixes:
# Modify the refs_for_prefix calls in the script
# Example: change tab: to tbl:
TAB_REFS=$(refs_for_prefix "tbl")
For additional citation commands:
Open check_latex_refs.sh and extend the CITATIONS grep pattern:
# Add your custom command, e.g. \mycite
grep -oP '\\(cite...|mycite)\*?\{[^}]+\}'
Troubleshooting
Issue: Script shows no output
Solution:
# Check file exists and has content
ls -la manuscript.tex
wc -l manuscript.tex
# Check for labels
grep -F '\label{' manuscript.tex | head -5
# Run with debug mode
bash -x check_latex_refs.sh manuscript.tex
Issue: Bibliography not found
Solution:
# Check bibliography command exists
grep -E 'bibliography|addbibresource' manuscript.tex
# Verify .bib file location
ls -la *.bib
# Check file permissions
ls -la references.bib
Issue: Commented lines still appear
Solution:
# Check sed version
sed --version
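# Try an alternative that handles full-line and inline comments in one sed call
TEXCONTENT=$(sed -e 's/^[[:space:]]*%.*$//' -e 's/\([^\\]\)%.*$/\1/' "$TEXFILE")
# Or use Perl (the negative lookbehind keeps escaped \% intact)
TEXCONTENT=$(perl -pe 's/(?<!\\)%.*$//' "$TEXFILE")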
Issue: Special characters in labels
Solution:
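# [^}]+ already accepts ., _, and - inside label names
grep -oP '\\label\{tab:\K[^}]+' "$TEXFILE"
# When comparing labels, match them literally with -F
echo "$TAB_REFS" | grep -qxF "$label"
# Or use a simpler fixed-string pattern
grep -F '\label{tab:' "$TEXFILE" | cut -d'{' -f2 | cut -d'}' -f1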
Issue: False positives
Solution:
# Check if label actually exists in source
grep -n "label{tab:problematic}" manuscript.tex
# Verify it's not in a comment
grep "tab:problematic" manuscript.tex | grep -v '^[[:space:]]*%'
Performance Optimization
For very large documents (>10,000 lines):
# Use parallel processing for multiple files
find . -name "*.tex" | parallel ./check_latex_refs.sh {}
# Cache intermediate results (full-line comments removed first, then inline comments)
TEXCONTENT=$(grep -v '^\s*%' "$TEXFILE" | sed 's/\([^\\]\)%.*$/\1/' | tee /tmp/cleaned.tex)
# Use faster grep alternatives
# Install ripgrep: sudo apt install ripgrep
rg -oP '\\label\{tab:\K[^}]+' "$TEXFILE"
How to Cite
Samad, M. A. (2026). LaTeX reference checker: Bash script to find unused labels and missing citations. ScholarsNote. https://www.scholarsnote.org/posts/latex-label-prefix-guide-blog/
BibTeX:
@misc{samad2026latexrefchecker,
author = {Samad, Md Abdus},
title = {LaTeX Reference Checker: Bash Script to Find
Unused Labels and Missing Citations},
year = {2026},
month = jan,
howpublished = {ScholarsNote},
url = {https://www.scholarsnote.org/posts/latex-label-prefix-guide-blog/},
note = {Accessed: 2026-01-03}
}