LaTeX Reference Checker: Bash Script to Find Unused Labels and Missing Citations
A robust bash script that audits LaTeX documents to find unreferenced figures, tables, equations, and citation issues while ignoring commented code.
When preparing research manuscripts, it is common to accumulate unused labels, orphaned citations, and commented-out content. This bash script provides a comprehensive audit of your LaTeX document, identifying unreferenced figures, tables, and equations, unused bibliography entries, and missing citations – all while properly ignoring commented lines. The script preserves the sequence in which labels appear in your document, making it easier to locate and manage them.
The Problem
During manuscript development, you might encounter:
| Issue | Example | Impact |
|---|---|---|
| Unreferenced labels | \label{fig:old_analysis} never used | Clutters document, confuses reviewers |
| Commented labels counted | % \label{tab:removed} still detected | False positives in checks |
| Missing citations | \cite{smith2023} but not in .bib | Undefined-citation warnings and [?] markers in the PDF |
| Unused bibliography | References added but never cited | Inflates reference count |
Manual checking across 50+ pages with multiple revisions becomes impractical.
The Solution
Features
- Comment-aware: Ignores both full-line and inline comments
- Sequence-preserving: Shows labels in document order
- Comprehensive: Checks figures, tables, equations, and citations
- Multiple reference styles: Supports \ref, \autoref, and \eqref
- Lightweight: A single bash script that needs only GNU grep (for the -P option), sed, and awk
- Fast: Processes typical manuscripts in less than one second
Installation and Usage
Follow these three simple steps to start using the reference checker:
- Save the script - Copy the code below and save it as check_latex_refs.sh in your LaTeX project directory
- Make it executable - Run chmod +x check_latex_refs.sh in your terminal
- Run the checker - Execute ./check_latex_refs.sh manuscript.tex (replace with your filename)
Step 1: Create the Script
Copy the following code and save it as check_latex_refs.sh in the same directory as your LaTeX file:
#!/bin/bash
# Requires GNU grep (for -P), sed, and awk.
if [ $# -eq 0 ]; then
    echo "Usage: $0 <latex_file.tex>"
    exit 1
fi
TEXFILE=$1
if [ ! -f "$TEXFILE" ]; then
    echo "File not found: $TEXFILE"
    exit 1
fi
# Remove comments in two steps:
# Step 1: Remove lines that start with % (including whitespace before %)
# Step 2: Remove inline comments (everything after % that's not \%)
TEXCONTENT=$(grep -v '^\s*%' "$TEXFILE" | sed 's/\([^\\]\)%.*$/\1/')
echo "Checking references in: $TEXFILE"
echo "========================================"
# Tables (preserving order, excluding comments)
echo -e "\n=== TABLES ==="
TAB_LABELS=$(echo "$TEXCONTENT" | grep -oP '\\label\{tab:\K[^}]+')
TAB_LABELS_UNIQ=$(echo "$TAB_LABELS" | awk '!seen[$0]++')
# Catches both \ref and \autoref
TAB_REFS=$(echo "$TEXCONTENT" | grep -oP '\\(auto)?ref\{tab:\K[^}]+' | awk '!seen[$0]++')
TAB_COUNT=$(echo "$TAB_LABELS_UNIQ" | grep -v '^$' | wc -l)
REF_COUNT=$(echo "$TAB_REFS" | grep -v '^$' | wc -l)
echo "Total labels: $TAB_COUNT"
echo "Total refs: $REF_COUNT"
echo "Unreferenced tables:"
while IFS= read -r label; do
    # Skip the empty line produced when there are no labels of this type
    [ -z "$label" ] && continue
    # -F compares fixed strings, so "." in a label is not treated as a regex wildcard
    if ! echo "$TAB_REFS" | grep -qxF -- "$label"; then
        echo " tab:$label"
    fi
done <<< "$TAB_LABELS_UNIQ"
# Figures (preserving order, excluding comments)
echo -e "\n=== FIGURES ==="
FIG_LABELS=$(echo "$TEXCONTENT" | grep -oP '\\label\{fig:\K[^}]+')
FIG_LABELS_UNIQ=$(echo "$FIG_LABELS" | awk '!seen[$0]++')
# Catches both \ref and \autoref
FIG_REFS=$(echo "$TEXCONTENT" | grep -oP '\\(auto)?ref\{fig:\K[^}]+' | awk '!seen[$0]++')
FIG_COUNT=$(echo "$FIG_LABELS_UNIQ" | grep -v '^$' | wc -l)
FIGREF_COUNT=$(echo "$FIG_REFS" | grep -v '^$' | wc -l)
echo "Total labels: $FIG_COUNT"
echo "Total refs: $FIGREF_COUNT"
echo "Unreferenced figures:"
while IFS= read -r label; do
    [ -z "$label" ] && continue
    if ! echo "$FIG_REFS" | grep -qxF -- "$label"; then
        echo " fig:$label"
    fi
done <<< "$FIG_LABELS_UNIQ"
# Equations (preserving order, excluding comments)
echo -e "\n=== EQUATIONS ==="
EQ_LABELS=$(echo "$TEXCONTENT" | grep -oP '\\label\{eq:\K[^}]+')
EQ_LABELS_UNIQ=$(echo "$EQ_LABELS" | awk '!seen[$0]++')
# Catches \ref, \autoref, and \eqref
EQ_REFS=$(echo "$TEXCONTENT" | grep -oP '\\((auto|eq))?ref\{eq:\K[^}]+' | awk '!seen[$0]++')
EQ_COUNT=$(echo "$EQ_LABELS_UNIQ" | grep -v '^$' | wc -l)
EQREF_COUNT=$(echo "$EQ_REFS" | grep -v '^$' | wc -l)
echo "Total labels: $EQ_COUNT"
echo "Total refs: $EQREF_COUNT"
echo "Unreferenced equations:"
while IFS= read -r label; do
    [ -z "$label" ] && continue
    if ! echo "$EQ_REFS" | grep -qxF -- "$label"; then
        echo " eq:$label"
    fi
done <<< "$EQ_LABELS_UNIQ"
# Citations (excluding comments)
echo -e "\n=== CITATIONS ==="
BIBFILE=$(echo "$TEXCONTENT" | grep -oP '\\bibliography\{\K[^}]+' | head -1)
if [ -n "$BIBFILE" ]; then
    [[ "$BIBFILE" != *.bib ]] && BIBFILE="${BIBFILE}.bib"
    if [ -f "$BIBFILE" ]; then
        echo "Using bibliography file: $BIBFILE"
        BIB_ENTRIES=$(grep -oP '@\w+\{\K[^,]+' "$BIBFILE" | awk '!seen[$0]++')
        # Split multi-key citations on commas and trim whitespace on both sides of each key
        CITATIONS=$(echo "$TEXCONTENT" | grep -oP '\\cite[tp]?\{\K[^}]+' | tr ',' '\n' | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' | awk '!seen[$0]++')
        BIB_COUNT=$(echo "$BIB_ENTRIES" | grep -v '^$' | wc -l)
        CITE_TOTAL=$(echo "$TEXCONTENT" | grep -oP '\\cite[tp]?\{[^}]+\}' | wc -l)
        CITE_UNIQ=$(echo "$CITATIONS" | grep -v '^$' | wc -l)
        echo "Total bib entries: $BIB_COUNT"
        echo "Total citation commands: $CITE_TOTAL"
        echo "Unique cited keys: $CITE_UNIQ"
        echo "Uncited bibliography entries:"
        while IFS= read -r entry; do
            [ -z "$entry" ] && continue
            if ! echo "$CITATIONS" | grep -qxF -- "$entry"; then
                echo " $entry"
            fi
        done <<< "$BIB_ENTRIES"
        echo "Missing bibliography entries (cited but not in .bib):"
        while IFS= read -r cite; do
            [ -z "$cite" ] && continue
            if ! echo "$BIB_ENTRIES" | grep -qxF -- "$cite"; then
                echo " $cite"
            fi
        done <<< "$CITATIONS"
    else
        echo "Bibliography file not found: $BIBFILE"
    fi
else
    echo "No bibliography file found in document"
fi
echo -e "\n========================================"
echo "Check complete!"
Step 2: Make it Executable
Open a terminal with a Bash shell (on Windows, use WSL or Git Bash, since chmod is not available in CMD or PowerShell), navigate to your project directory, and run:
chmod +x check_latex_refs.sh
This command gives the script permission to run on your system.
Step 3: Run the Checker
Execute the script with your LaTeX file:
./check_latex_refs.sh manuscript.tex
Replace manuscript.tex with your actual LaTeX filename. The script will analyze your document and display a comprehensive report of all references.
Example Output
Here is what the script reports for a typical manuscript:
Checking references in: paper.tex
========================================
=== TABLES ===
Total labels: 11
Total refs: 8
Unreferenced tables:
tab:appendix_data
tab:extra_results
tab:summary_stats
=== FIGURES ===
Total labels: 6
Total refs: 6
Unreferenced figures:
=== EQUATIONS ===
Total labels: 15
Total refs: 12
Unreferenced equations:
eq:supplementary
eq:variance
eq:alt_form
=== CITATIONS ===
Using bibliography file: references.bib
Total bib entries: 45
Total citation commands: 52
Unique cited keys: 38
Uncited bibliography entries:
smith2020old
jones2019unused
brown2018extra
Missing bibliography entries (cited but not in .bib):
nguyen2023missing
========================================
Check complete!
The script detects references made with \ref{fig:one}, \autoref{tab:results}, or \eqref{eq:main} – all are counted correctly.
Technical Deep Dive
How It Works
1. Comment Removal
The script uses a two-step approach to handle comments:
# Step 1: Remove lines starting with %
# Step 2: Remove inline comments but preserve \%
TEXCONTENT=$(grep -v '^\s*%' "$TEXFILE" | sed 's/\([^\\]\)%.*$/\1/')
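The order matters. The sed pattern needs one non-backslash character before the %, so it cannot strip a comment whose % is the first character on the line. Removing full-line comments first with grep closes that gap; sed then handles the inline comments.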
Test cases:
| LaTeX Code | Processed As |
|---|---|
| \label{fig:test} | Included |
| % \label{fig:old} | Excluded |
| Text \ref{fig:a} % comment | \ref{fig:a} included |
| 50\% efficiency | \% preserved |
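The test cases in the table can be reproduced with a short, self-contained sketch. The helper name strip_comments and the sample text are illustrative assumptions, not part of the script itself; the filter is the same two-step pipeline, written here with the POSIX class [[:space:]] in place of \s.

```shell
# strip_comments: hypothetical helper wrapping the two-step comment filter.
# Reads LaTeX source on stdin and writes the comment-free text to stdout.
strip_comments() {
    # Step 1: drop full-line comments; Step 2: drop inline comments but keep \%
    grep -v '^[[:space:]]*%' | sed 's/\([^\\]\)%.*$/\1/'
}

# Sample input mirroring the four rows of the test-case table
SAMPLE=$(cat <<'EOF'
\label{fig:test}
% \label{fig:old}
Text \ref{fig:a} % comment
50\% efficiency
EOF
)

CLEANED=$(printf '%s\n' "$SAMPLE" | strip_comments)
printf '%s\n' "$CLEANED"
```

Three lines survive: the \label line, the \ref line with its trailing comment removed, and the 50\% line unchanged.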
2. Sequence Preservation
Uses awk '!seen[$0]++' to remove duplicates while maintaining order:
# Traditional approach (loses order)
sort -u
# Our approach (preserves order)
awk '!seen[$0]++'
Why this matters:
If your document has:
\section{Methods} % Line 50
\label{fig:workflow} % Line 65
\section{Results} % Line 150
\label{fig:accuracy} % Line 170
\section{Appendix} % Line 300
\label{fig:extra} % Line 310 (unreferenced)
The output shows fig:extra in document order (not alphabetically sorted), making it easier to locate.
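To see the difference between the two approaches side by side, here is a minimal sketch. The label list is made up for illustration (it includes deliberate duplicates), and the variable names are assumptions.

```shell
# Labels in the order they might appear in a document, with duplicates
LABELS=$(printf '%s\n' workflow accuracy workflow extra accuracy)

# Order-preserving de-duplication: a line is printed only the first time it is seen
ORDERED=$(printf '%s\n' "$LABELS" | awk '!seen[$0]++')

# Sorting de-duplicates too, but discards document order
SORTED=$(printf '%s\n' "$LABELS" | sort -u)

echo "Document order: $(printf '%s\n' "$ORDERED" | paste -sd' ' -)"
echo "Sorted order:   $(printf '%s\n' "$SORTED" | paste -sd' ' -)"
```

The first line lists workflow, accuracy, extra (first appearance wins); the second lists accuracy, extra, workflow.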
3. Pattern Matching
Uses Perl-compatible regex with grep -oP:
# For tables and figures: catches both \ref and \autoref
grep -oP '\\(auto)?ref\{tab:\K[^}]+'
# For equations: catches \ref, \autoref, and \eqref
grep -oP '\\((auto|eq))?ref\{eq:\K[^}]+'
Breakdown:
- \\(auto)?ref\{tab: – Match \ref{tab: or \autoref{tab:
- \\((auto|eq))?ref\{eq: – Match \ref{eq:, \autoref{eq:, or \eqref{eq:
- \K – Discard everything matched so far
- [^}]+ – Capture everything until }
Supported reference commands:
- \ref{...} – Standard LaTeX reference
- \autoref{...} – Automatic reference from hyperref package
- \eqref{...} – Equation reference (for equations only)
Examples:
Labels detected:
- \label{tab:results} extracts results
- \label{fig:analysis_2023} extracts analysis_2023
- \label{eq:main_theorem} extracts main_theorem
References detected:
- \ref{fig:diagram} matches diagram
- \autoref{tab:results} matches results
- \eqref{eq:einstein} matches einstein
Not matched:
- % \label{tab:old} (filtered by comment removal)
- \label{sec:intro} (different prefix – use section extension)
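The patterns can be tried in isolation on a single line of text. This is a small sketch that assumes GNU grep (the -P option is not available in BSD/macOS grep); the sample sentence and variable names are made up for illustration.

```shell
# One sample line containing all three supported reference commands
SAMPLE='See \ref{fig:diagram}, \autoref{tab:results}, and \eqref{eq:einstein}.'

# \K drops the matched prefix, so only the label name after "fig:" is printed
FIG=$(printf '%s\n' "$SAMPLE" | grep -oP '\\(auto)?ref\{fig:\K[^}]+')
TAB=$(printf '%s\n' "$SAMPLE" | grep -oP '\\(auto)?ref\{tab:\K[^}]+')
EQ=$(printf '%s\n' "$SAMPLE" | grep -oP '\\((auto|eq))?ref\{eq:\K[^}]+')

echo "fig: $FIG"
echo "tab: $TAB"
echo "eq:  $EQ"
```

Each variable ends up holding just the label name without its prefix: diagram, results, and einstein.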
Advanced Usage
1. Check Multiple Files
Create check_all.sh:
#!/bin/bash
for file in *.tex; do
    echo "File: $file"
    ./check_latex_refs.sh "$file"
    echo ""
done
2. Generate Timestamped Reports
# Create dated log
./check_latex_refs.sh paper.tex > "audit_$(date +%Y%m%d_%H%M%S).log"
# Compare before/after revision
./check_latex_refs.sh paper.tex > before_revision.log
# ... make changes ...
./check_latex_refs.sh paper.tex > after_revision.log
diff before_revision.log after_revision.log
3. Git Pre-commit Hook
Add to .git/hooks/pre-commit:
#!/bin/bash
# Run check
./check_latex_refs.sh main.tex > ref_check.log
# Count flagged items (report lines that start with a space)
UNREFERENCED=$(grep -c "^ " ref_check.log)
if [ "$UNREFERENCED" -gt 0 ]; then
    echo "Warning: $UNREFERENCED unreferenced items found"
    cat ref_check.log
    # Git does not attach stdin to the terminal for hooks, so read the answer from /dev/tty
    read -p "Continue commit? (y/n) " -n 1 -r < /dev/tty
    echo
    if [[ ! $REPLY =~ ^[Yy]$ ]]; then
        exit 1
    fi
fi
4. Makefile Integration
# Makefile (recipe lines must be indented with a tab character)
.PHONY: check clean all audit
all: manuscript.pdf check
manuscript.pdf: manuscript.tex
	pdflatex manuscript.tex
	bibtex manuscript
	pdflatex manuscript.tex
	pdflatex manuscript.tex
check:
	@./check_latex_refs.sh manuscript.tex
clean:
	rm -f *.aux *.log *.bbl *.blg *.out *.toc
audit:
	@./check_latex_refs.sh manuscript.tex > audit_$(shell date +%Y%m%d).log
	@cat audit_$(shell date +%Y%m%d).log
Usage:
make # Compile and check
make check # Run reference check only
make audit # Generate timestamped audit
5. Pre-submission Checklist Script
Create pre_submit.sh:
#!/bin/bash
echo "Pre-submission Checklist"
echo "========================"
# 1. Reference check
echo "1. Checking references..."
./check_latex_refs.sh manuscript.tex > ref_audit.log
# Count flagged items (report lines that start with a space)
ISSUES=$(grep -c "^ " ref_audit.log)
if [ "$ISSUES" -eq 0 ]; then
    echo " No unreferenced items"
else
    echo " $ISSUES unreferenced items found"
    cat ref_audit.log
fi
# 2. Check compilation
echo "2. Checking compilation..."
if pdflatex -interaction=nonstopmode manuscript.tex > /dev/null 2>&1; then
    echo " Compiles successfully"
else
    echo " Compilation errors found"
fi
# 3. Check bibliography
echo "3. Checking bibliography..."
if bibtex manuscript > /dev/null 2>&1; then
    echo " Bibliography processed"
else
    echo " BibTeX reported errors"
fi
echo "========================"
echo "Checklist complete!"
Extensions and Customizations
Add Line Numbers
Modify the unreferenced item display to show line numbers:
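while IFS= read -r label; do
    [ -z "$label" ] && continue
    if ! echo "$TAB_REFS" | grep -qxF -- "$label"; then
        # -F matches \label{...} literally; head -1 keeps only the first occurrence
        LINE=$(grep -nF "\\label{tab:$label}" "$TEXFILE" | head -1 | cut -d: -f1)
        echo " tab:$label (line $LINE)"
    fi
done <<< "$TAB_LABELS_UNIQ"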
Output:
Unreferenced tables:
tab:appendix_data (line 145)
tab:extra_results (line 203)
Add Section Labels
Add this after the equations section:
# Sections (preserving order, excluding comments)
echo -e "\n=== SECTIONS ==="
SEC_LABELS=$(echo "$TEXCONTENT" | grep -oP '\\label\{sec:\K[^}]+')
SEC_LABELS_UNIQ=$(echo "$SEC_LABELS" | awk '!seen[$0]++')
# Catches both \ref and \autoref, like the table and figure checks
SEC_REFS=$(echo "$TEXCONTENT" | grep -oP '\\(auto)?ref\{sec:\K[^}]+' | awk '!seen[$0]++')
SEC_COUNT=$(echo "$SEC_LABELS_UNIQ" | grep -v '^$' | wc -l)
SECREF_COUNT=$(echo "$SEC_REFS" | grep -v '^$' | wc -l)
echo "Total section labels: $SEC_COUNT"
echo "Total section refs: $SECREF_COUNT"
echo "Unreferenced sections:"
while IFS= read -r label; do
    [ -z "$label" ] && continue
    if ! echo "$SEC_REFS" | grep -qxF -- "$label"; then
        echo " sec:$label"
    fi
done <<< "$SEC_LABELS_UNIQ"
Color Output
Add ANSI color codes for better visibility:
# Add at top of script (after shebang)
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Use in output
echo -e "${BLUE}=== TABLES ===${NC}"
echo "Total labels: $TAB_COUNT"
echo "Total refs: $REF_COUNT"
# Collect unreferenced labels first: equal label and ref counts alone
# do not prove that every label is referenced
UNREF_TABS=""
while IFS= read -r label; do
    [ -z "$label" ] && continue
    if ! echo "$TAB_REFS" | grep -qxF -- "$label"; then
        UNREF_TABS+=" tab:$label"$'\n'
    fi
done <<< "$TAB_LABELS_UNIQ"
if [ -z "$UNREF_TABS" ]; then
    echo -e "${GREEN}All tables referenced${NC}"
else
    echo -e "${RED}Unreferenced tables:${NC}"
    printf '%b%s%b' "$YELLOW" "$UNREF_TABS" "$NC"
fi
Export to CSV
Create a CSV export function:
#!/bin/bash
# Export unreferenced items to CSV
TEXFILE=$1
OUTFILE="${TEXFILE%.tex}_unreferenced.csv"
# Run the check and extract unreferenced items
./check_latex_refs.sh "$TEXFILE" > /tmp/ref_check.log
# Create CSV header
echo "Type,Label" > "$OUTFILE"
# Extract and format
grep "^ tab:" /tmp/ref_check.log | sed 's/^ /table,/' >> "$OUTFILE"
grep "^ fig:" /tmp/ref_check.log | sed 's/^ /figure,/' >> "$OUTFILE"
grep "^ eq:" /tmp/ref_check.log | sed 's/^ /equation,/' >> "$OUTFILE"
echo "CSV exported to: $OUTFILE"
JSON Output
Add JSON export capability:
# Add inside check_latex_refs.sh, after TEXCONTENT is set
# Add --json flag support
if [ "$2" == "--json" ]; then
    # Extract data
    TAB_LABELS=$(echo "$TEXCONTENT" | grep -oP '\\label\{tab:\K[^}]+' | awk '!seen[$0]++')
    TAB_REFS=$(echo "$TEXCONTENT" | grep -oP '\\(auto)?ref\{tab:\K[^}]+' | awk '!seen[$0]++')
    # Build unreferenced array (comm needs sorted input, so document order is lost here)
    UNREF_TABS=$(comm -23 <(echo "$TAB_LABELS" | sort) <(echo "$TAB_REFS" | sort) | \
        grep -v '^$' | awk '{printf "\"%s\",", $0}' | sed 's/,$//')
    # Output JSON
    cat << EOF
{
  "file": "$TEXFILE",
  "timestamp": "$(date -Iseconds)",
  "tables": {
    "total_labels": $(echo "$TAB_LABELS" | grep -v '^$' | wc -l),
    "total_refs": $(echo "$TAB_REFS" | grep -v '^$' | wc -l),
    "unreferenced": [$UNREF_TABS]
  }
}
EOF
fi
Workflow Integration Examples
1. Continuous Integration (CI)
For GitHub Actions (.github/workflows/latex-check.yml):
name: LaTeX Reference Check
on: [push, pull_request]
jobs:
  check-refs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run reference checker
        run: |
          chmod +x check_latex_refs.sh
          ./check_latex_refs.sh manuscript.tex
      - name: Count issues
        run: |
          ISSUES=$(./check_latex_refs.sh manuscript.tex | grep "^ " | wc -l)
          echo "Found $ISSUES unreferenced items"
          if [ "$ISSUES" -gt 5 ]; then
            echo "Too many unreferenced items!"
            exit 1
          fi
2. VS Code Task
Add to .vscode/tasks.json:
{
  "version": "2.0.0",
  "tasks": [
    {
      "label": "Check LaTeX References",
      "type": "shell",
      "command": "./check_latex_refs.sh",
      "args": ["${file}"],
      "problemMatcher": [],
      "presentation": {
        "reveal": "always",
        "panel": "new"
      }
    }
  ]
}
3. Overleaf/Git Sync
For projects synced between Overleaf and Git:
#!/bin/bash
# sync_and_check.sh
# Pull from Overleaf
git pull origin master
# Run check
./check_latex_refs.sh main.tex > ref_check.log
# If clean, push changes
if [ "$(grep -c "^ " ref_check.log)" -eq 0 ]; then
    git add .
    git commit -m "Update: references checked"
    git push origin master
else
    echo "Unreferenced items found. Fix before pushing."
    cat ref_check.log
fi
Comparison with Alternatives
| Tool | Pros | Cons | Best For |
|---|---|---|---|
| This script | Fast, customizable, comment-aware, preserves order | Single file only | Quick audits, CI/CD |
| refcheck package | LaTeX-integrated, visual markers | Must recompile, clutters PDF | During writing |
| chktex | Comprehensive linting, many checks | Verbose, complex output | Deep analysis |
| VS Code LaTeX Workshop | Real-time, editor-integrated | Editor-specific, no batch | Active editing |
| Python scripts | Very flexible, HTML reports | Slower, requires Python | Custom workflows |
Real-World Case Study
Initial State
Manuscript for journal submission with 8 months of revisions:
=== TABLES ===
Total labels: 8
Total refs: 7
Unreferenced tables:
tab:correlation_matrix
=== FIGURES ===
Total labels: 12
Total refs: 12
=== EQUATIONS ===
Total labels: 6
Total refs: 6
=== CITATIONS ===
Total bib entries: 67
Uncited bibliography entries:
lecun2015deep
goodfellow2016deep
chollet2017deep
bishop2006pattern
murphy2012machine
hastie2009elements
james2013introduction
vapnik1995nature
cover2006elements
Actions Taken
- Moved tab:correlation_matrix to supplementary materials
- Removed 9 unused deep learning references (leftover from earlier drafts)
- Cleaned up bibliography from 67 to 58 entries
- Verified all remaining citations were relevant
Final State
=== TABLES ===
Total labels: 7
Total refs: 7
Unreferenced tables:
=== CITATIONS ===
Total bib entries: 58
Uncited bibliography entries:
Time saved: Approximately 30 minutes of manual checking
Outcome: Clean submission, no reviewer comments about references
Limitations and Workarounds
Current Limitations
- Single file only: Does not traverse \input{} or \include{} commands
- Standard prefixes: Assumes tab:, fig:, eq:, sec: conventions
- Reference commands: Detects \ref, \autoref, and \eqref (for equations)
- Citation commands: Only detects \cite, \citep, \citet
- Bibliography format: Requires standard \bibliography{} command
Workarounds
For multi-file projects:
# Combine files first
cat main.tex chapter*.tex appendix.tex > combined.tex
./check_latex_refs.sh combined.tex
rm combined.tex
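When the chapter files are not known in advance, an alternative is to follow the \input{} and \include{} commands themselves. The following is a rough sketch, not part of the original script: flatten_tex is a hypothetical helper, and it assumes each command sits on its own line and that paths are relative to the main file's directory (any text after the closing brace on that line is dropped).

```shell
# flatten_tex FILE [ROOT]: hypothetical helper that prints FILE with each
# \input{...} or \include{...} line replaced by the referenced file's contents.
# ROOT defaults to FILE's directory; included paths are resolved against it.
flatten_tex() {
    local file root line child
    file=$1
    root=${2:-$(dirname "$1")}
    while IFS= read -r line || [ -n "$line" ]; do
        # Extract the path if the line starts with \input{...} or \include{...}
        child=$(printf '%s\n' "$line" |
            sed -nE 's/^[[:space:]]*\\(input|include)\{([^}]+)\}.*/\2/p')
        if [ -n "$child" ]; then
            # LaTeX allows the .tex extension to be omitted
            case $child in
                *.tex) ;;
                *) child="$child.tex" ;;
            esac
            if [ -f "$root/$child" ]; then
                flatten_tex "$root/$child" "$root"
            else
                printf '%% missing file: %s\n' "$child"
            fi
        else
            printf '%s\n' "$line"
        fi
    done < "$file"
}
```

A possible use is flatten_tex main.tex > combined.tex followed by ./check_latex_refs.sh combined.tex. Commented-out \input lines are passed through unchanged and are later discarded by the checker's own comment removal.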
For custom prefixes:
# Modify the grep patterns in the script
# Change tab: to tbl:
grep -oP '\\label\{tbl:\K[^}]+'
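Rather than editing each pattern by hand, the prefix can be passed as a parameter. This is one possible sketch, assuming GNU grep for -P; the function name unreferenced_labels is hypothetical, and the prefix is assumed to contain no regex metacharacters.

```shell
# unreferenced_labels PREFIX CONTENT: hypothetical helper that prints, in
# document order, every PREFIX:label in CONTENT that is never referenced
# via \ref, \autoref, or \eqref.
unreferenced_labels() {
    local prefix content labels refs
    prefix=$1
    content=$2
    labels=$(printf '%s\n' "$content" |
        grep -oP '\\label\{'"$prefix"':\K[^}]+' | awk '!seen[$0]++')
    refs=$(printf '%s\n' "$content" |
        grep -oP '\\(auto|eq)?ref\{'"$prefix"':\K[^}]+' | awk '!seen[$0]++')
    printf '%s\n' "$labels" | while IFS= read -r label; do
        [ -n "$label" ] || continue
        # -F: compare as fixed strings; -x: require a whole-line match
        printf '%s\n' "$refs" | grep -qxF -- "$label" ||
            printf '%s:%s\n' "$prefix" "$label"
    done
}

# Example with a custom "tbl:" prefix
DOC='\label{tbl:a} as shown in \ref{tbl:a}
\label{tbl:b}
\label{alg:x} and \autoref{alg:x}'
unreferenced_labels tbl "$DOC"
```

With the sample text above, only tbl:b is reported, and calling the function with the alg prefix reports nothing.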
For additional citation commands:
# Add to CITATIONS line:
CITATIONS=$(echo "$TEXCONTENT" | grep -oP '\\cite(p|t|author|year)?\{\K[^}]+' | ...)
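Citation commands with optional arguments, such as \citep[see][p.~3]{key}, are not matched by the patterns above because the brace does not follow the command name directly. A broader extraction could look like the sketch below. It assumes GNU grep for -P; extract_cite_keys is a hypothetical helper, and the pattern deliberately accepts any command whose name contains "cite", which may be broader than you want.

```shell
# extract_cite_keys: hypothetical helper. Reads LaTeX source on stdin and
# prints each cited key once, in order of first appearance.
extract_cite_keys() {
    # \\[A-Za-z]*cite[A-Za-z]*  any command name containing "cite"
    # \*?                       optional starred form
    # (\[[^\]]*\])*             zero or more optional [..] arguments
    grep -oP '\\[A-Za-z]*cite[A-Za-z]*\*?(\[[^\]]*\])*\{\K[^}]+' |
        tr ',' '\n' |
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |
        grep -v -e '^$' -e '^\*$' |
        awk '!seen[$0]++'
}

SAMPLE='\citep[see][p.~3]{smith2020, jones2019}, \citeauthor{lee2021}, \cite{smith2020}, \nocite{*}'
KEYS=$(printf '%s\n' "$SAMPLE" | extract_cite_keys)
printf '%s\n' "$KEYS"
```

The sample yields smith2020, jones2019, and lee2021, one per line; the wildcard from \nocite{*} is filtered out because it is not a real key.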
For biblatex:
# Change BIBFILE detection:
BIBFILE=$(echo "$TEXCONTENT" | grep -oP '\\addbibresource\{\K[^}]+' | head -1)
Troubleshooting
Issue: Script shows no output
Solution:
# Check file exists and has content
ls -la manuscript.tex
wc -l manuscript.tex
# Check for labels
grep -F '\label{' manuscript.tex | head -5
# Run with debug mode
bash -x check_latex_refs.sh manuscript.tex
Issue: Bibliography not found
Solution:
# Check bibliography command exists
grep "bibliography" manuscript.tex
# Verify .bib file location
ls -la *.bib
# Check file permissions
ls -la references.bib
Issue: Commented lines still appear
Solution:
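# Check sed version (GNU sed prints a version; BSD/macOS sed does not accept --version)
sed --version
# Try alternative comment removal using the POSIX class instead of \s
TEXCONTENT=$(grep -v '^[[:space:]]*%' "$TEXFILE" | sed 's/\([^\\]\)%.*$/\1/')
# Or use Perl: the negative lookbehind strips full-line and inline comments in one pass and keeps \%
TEXCONTENT=$(perl -pe 's/(?<!\\)%.*$//' "$TEXFILE")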
Issue: Special characters in labels
Solution:
# Compare labels as fixed strings so characters such as . or + are not read as regex syntax
echo "$TAB_REFS" | grep -qxF -- "$label"
# Or extract labels with a simpler pattern (no -P needed)
grep -F '\label{tab:' manuscript.tex | cut -d'{' -f2 | cut -d'}' -f1
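A short sketch shows why regex metacharacters in labels matter. The label and reference names here are invented for illustration: with plain grep -qx the dot in the label acts as a wildcard and matches a different reference, so an unreferenced label would go unreported.

```shell
# A reference list containing a key that differs from the label by one character
REFS='resultsXv2'
LABEL='results.v2'

# Regex comparison: "." matches any character, so this wrongly succeeds
if printf '%s\n' "$REFS" | grep -qx -- "$LABEL"; then
    REGEX_MATCH=yes
else
    REGEX_MATCH=no
fi

# Fixed-string comparison: the label must match character for character
if printf '%s\n' "$REFS" | grep -qxF -- "$LABEL"; then
    FIXED_MATCH=yes
else
    FIXED_MATCH=no
fi

echo "regex match: $REGEX_MATCH"
echo "fixed-string match: $FIXED_MATCH"
```

The regex comparison reports yes and the fixed-string comparison reports no, which is the behavior you want when labels are treated as literal names.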
Issue: False positives
Solution:
# Check if label actually exists in source
grep -n "label{tab:problematic}" manuscript.tex
# Verify it's not in a comment
grep "tab:problematic" manuscript.tex | grep -v '^[[:space:]]*%'
Performance Optimization
For very large documents (>10,000 lines):
# Use parallel processing for multiple files
find . -name "*.tex" | parallel ./check_latex_refs.sh {}
# Cache intermediate results
TEXCONTENT=$(grep -v '^\s*%' "$TEXFILE" | sed 's/\([^\\]\)%.*$/\1/' | tee /tmp/cleaned.tex)
# Use faster grep alternatives
# Install ripgrep: sudo apt install ripgrep
rg -oP '\\label\{tab:\K[^}]+' "$TEXFILE"
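Another possible optimization, offered here as a sketch rather than something the script does today, is to replace the per-label loop (one grep process per label) with a single grep call that filters the whole label list against a pattern file. The function name unreferenced_fast and the temporary file are assumptions for illustration.

```shell
# unreferenced_fast LABELS REFS: hypothetical helper. Both arguments are
# newline-separated lists. Prints the labels that do not appear in REFS,
# keeping their original order, using one grep process in total.
unreferenced_fast() {
    local labels refs tmp
    labels=$1
    refs=$2
    tmp=$(mktemp)
    printf '%s\n' "$refs" > "$tmp"
    # -f: read patterns from file; -x: whole-line match; -F: fixed strings;
    # -v: invert, so only labels without a matching reference are printed
    printf '%s\n' "$labels" | grep -vxF -f "$tmp" | grep -v '^$'
    rm -f "$tmp"
}

LABELS=$(printf '%s\n' intro methods results)
REFS=$(printf '%s\n' methods)
unreferenced_fast "$LABELS" "$REFS"
```

For the sample lists this prints intro and results. The benefit grows with the number of labels, since the loop version starts a new process for each one.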
How to Cite
Samad, M. A. (2026). LaTeX reference checker: Bash script to find unused labels and missing citations. ScholarsNote. https://doi.org/10.59350/XXXXXXXX-XXXXX
BibTeX:
@misc{samad2026latexrefchecker,
author = {Samad, Md Abdus},
title = {LaTeX Reference Checker: Bash Script to Find Unused Labels and Missing Citations},
year = {2026},
month = jan,
howpublished = {ScholarsNote},
url = {https://www.scholarsnote.org/posts/latex-label-prefix-guide-blog/},
doi = {10.59350/XXXXXXXX-XXXXX},
note = {Accessed: 2026-01-03}
}