LaTeX Reference Checker: Bash Script to Find Unused Labels and Missing Citations

A robust bash script that audits LaTeX documents to find unreferenced figures, tables, equations, and citation issues while ignoring commented code.

When preparing research manuscripts, it is common to accumulate unused labels, orphaned citations, and commented-out content. This bash script provides a comprehensive audit of your LaTeX document, identifying unreferenced figures, tables, and equations, unused bibliography entries, and missing citations – all while properly ignoring commented lines. The script preserves the sequence in which labels appear in your document, making it easier to locate and manage them.

The Problem

During manuscript development, you might encounter:

Issue | Example | Impact
Unreferenced labels | \label{fig:old_analysis} never used | Clutters document, confuses reviewers
Commented labels counted | % \label{tab:removed} still detected | False positives in checks
Missing citations | \cite{smith2023} but not in .bib | Compilation errors
Unused bibliography | References added but never cited | Inflates reference count

Manual checking across 50+ pages with multiple revisions becomes impractical.

The Solution

Features

  • Comment-aware: Ignores both full-line and inline comments
  • Sequence-preserving: Shows labels in document order
  • Comprehensive: Checks figures, tables, equations, and citations
  • Multiple reference styles: Supports \ref, \autoref, and \eqref
  • Lightweight: A single bash script that needs only standard command-line tools (GNU grep with -P support, sed, and awk)
  • Fast: Processes typical manuscripts in less than one second

Installation and Usage

Follow these three simple steps to start using the reference checker:

  1. Save the script - Copy the code below and save it as check_latex_refs.sh in your LaTeX project directory
  2. Make it executable - Run chmod +x check_latex_refs.sh in your terminal
  3. Run the checker - Execute ./check_latex_refs.sh manuscript.tex (replace with your filename)

Step 1: Create the Script

Copy the following code and save it as check_latex_refs.sh in the same directory as your LaTeX file:

#!/bin/bash

if [ $# -eq 0 ]; then
    echo "Usage: $0 <latex_file.tex>"
    exit 1
fi

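TEXFILE="$1"

# Stop early if the file does not exist
if [ ! -f "$TEXFILE" ]; then
    echo "Error: file not found: $TEXFILE" >&2
    exit 1
fi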

# Remove comments in two steps:
# Step 1: Remove lines that start with % (including whitespace before %)
# Step 2: Remove inline comments (everything after % that's not \%)
TEXCONTENT=$(grep -v '^\s*%' "$TEXFILE" | sed 's/\([^\\]\)%.*$/\1/')

echo "Checking references in: $TEXFILE"
echo "========================================"

# Tables (preserving order, excluding comments)
echo -e "\n=== TABLES ==="
TAB_LABELS=$(echo "$TEXCONTENT" | grep -oP '\\label\{tab:\K[^}]+')
TAB_LABELS_UNIQ=$(echo "$TAB_LABELS" | awk '!seen[$0]++')
# Catches both \ref and \autoref
TAB_REFS=$(echo "$TEXCONTENT" | grep -oP '\\(auto)?ref\{tab:\K[^}]+' | awk '!seen[$0]++')
TAB_COUNT=$(echo "$TAB_LABELS_UNIQ" | grep -v '^$' | wc -l)
REF_COUNT=$(echo "$TAB_REFS" | grep -v '^$' | wc -l)

echo "Total labels: $TAB_COUNT"
echo "Total refs: $REF_COUNT"
echo "Unreferenced tables:"

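while IFS= read -r label; do
    [ -z "$label" ] && continue  # no labels found: skip the empty line
    # -F compares the label as a fixed string, so "." is not a regex wildcard
    if ! echo "$TAB_REFS" | grep -qxF -- "$label"; then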
        echo "  tab:$label"
    fi
done <<< "$TAB_LABELS_UNIQ"

# Figures (preserving order, excluding comments)
echo -e "\n=== FIGURES ==="
FIG_LABELS=$(echo "$TEXCONTENT" | grep -oP '\\label\{fig:\K[^}]+')
FIG_LABELS_UNIQ=$(echo "$FIG_LABELS" | awk '!seen[$0]++')
# Catches both \ref and \autoref
FIG_REFS=$(echo "$TEXCONTENT" | grep -oP '\\(auto)?ref\{fig:\K[^}]+' | awk '!seen[$0]++')
FIG_COUNT=$(echo "$FIG_LABELS_UNIQ" | grep -v '^$' | wc -l)
FIGREF_COUNT=$(echo "$FIG_REFS" | grep -v '^$' | wc -l)

echo "Total labels: $FIG_COUNT"
echo "Total refs: $FIGREF_COUNT"
echo "Unreferenced figures:"

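while IFS= read -r label; do
    [ -z "$label" ] && continue  # no labels found: skip the empty line
    # -F compares the label as a fixed string, so "." is not a regex wildcard
    if ! echo "$FIG_REFS" | grep -qxF -- "$label"; then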
        echo "  fig:$label"
    fi
done <<< "$FIG_LABELS_UNIQ"

# Equations (preserving order, excluding comments)
echo -e "\n=== EQUATIONS ==="
EQ_LABELS=$(echo "$TEXCONTENT" | grep -oP '\\label\{eq:\K[^}]+')
EQ_LABELS_UNIQ=$(echo "$EQ_LABELS" | awk '!seen[$0]++')
# Catches \ref, \autoref, and \eqref
EQ_REFS=$(echo "$TEXCONTENT" | grep -oP '\\((auto|eq))?ref\{eq:\K[^}]+' | awk '!seen[$0]++')
EQ_COUNT=$(echo "$EQ_LABELS_UNIQ" | grep -v '^$' | wc -l)
EQREF_COUNT=$(echo "$EQ_REFS" | grep -v '^$' | wc -l)

echo "Total labels: $EQ_COUNT"
echo "Total refs: $EQREF_COUNT"
echo "Unreferenced equations:"

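while IFS= read -r label; do
    [ -z "$label" ] && continue  # no labels found: skip the empty line
    # -F compares the label as a fixed string, so "." is not a regex wildcard
    if ! echo "$EQ_REFS" | grep -qxF -- "$label"; then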
        echo "  eq:$label"
    fi
done <<< "$EQ_LABELS_UNIQ"

# Citations (excluding comments)
echo -e "\n=== CITATIONS ==="
BIBFILE=$(echo "$TEXCONTENT" | grep -oP '\\bibliography\{\K[^}]+' | head -1)

if [ -n "$BIBFILE" ]; then
    [[ "$BIBFILE" != *.bib ]] && BIBFILE="${BIBFILE}.bib"

    if [ -f "$BIBFILE" ]; then
        echo "Using bibliography file: $BIBFILE"

        BIB_ENTRIES=$(grep -oP '@\w+\{\K[^,]+' "$BIBFILE" | awk '!seen[$0]++')
        # Split comma-separated keys and trim whitespace on both sides of each key
        CITATIONS=$(echo "$TEXCONTENT" | grep -oP '\\cite[tp]?\{\K[^}]+' | tr ',' '\n' | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' | awk '!seen[$0]++')

        BIB_COUNT=$(echo "$BIB_ENTRIES" | grep -v '^$' | wc -l)
        CITE_TOTAL=$(echo "$TEXCONTENT" | grep -oP '\\cite[tp]?\{[^}]+\}' | wc -l)
        CITE_UNIQ=$(echo "$CITATIONS" | grep -v '^$' | wc -l)

        echo "Total bib entries: $BIB_COUNT"
        echo "Total citation commands: $CITE_TOTAL"
        echo "Unique cited keys: $CITE_UNIQ"

        echo "Uncited bibliography entries:"
        while IFS= read -r entry; do
            [ -z "$entry" ] && continue  # empty .bib: skip the empty line
            if ! echo "$CITATIONS" | grep -qxF -- "$entry"; then
                echo "  $entry"
            fi
        done <<< "$BIB_ENTRIES"

        echo "Missing bibliography entries (cited but not in .bib):"
        while IFS= read -r cite; do
            [ -z "$cite" ] && continue  # no citations: skip the empty line
            if ! echo "$BIB_ENTRIES" | grep -qxF -- "$cite"; then
                echo "  $cite"
            fi
        done <<< "$CITATIONS"
    else
        echo "Bibliography file not found: $BIBFILE"
    fi
else
    echo "No bibliography file found in document"
fi

echo -e "\n========================================"
echo "Check complete!"

Step 2: Make it Executable

Open a Bash terminal (on Windows, use WSL or Git Bash, since chmod is not available in CMD or PowerShell), navigate to your project directory, and run:

chmod +x check_latex_refs.sh

This command gives the script permission to run on your system.

Step 3: Run the Checker

Execute the script with your LaTeX file:

./check_latex_refs.sh manuscript.tex

Replace manuscript.tex with your actual LaTeX filename. The script will analyze your document and display a comprehensive report of all references.

Example Output

Here is what the script reports for a typical manuscript:

Checking references in: paper.tex
========================================

=== TABLES ===
Total labels: 11
Total refs: 8
Unreferenced tables:
  tab:appendix_data
  tab:extra_results
  tab:summary_stats

=== FIGURES ===
Total labels: 6
Total refs: 6
Unreferenced figures:

=== EQUATIONS ===
Total labels: 15
Total refs: 12
Unreferenced equations:
  eq:supplementary
  eq:variance
  eq:alt_form

=== CITATIONS ===
Using bibliography file: references.bib
Total bib entries: 45
Total citation commands: 52
Unique cited keys: 38
Uncited bibliography entries:
  smith2020old
  jones2019unused
  brown2018extra
Missing bibliography entries (cited but not in .bib):
  nguyen2023missing

========================================
Check complete!

The script detects references made with \ref{fig:one}, \autoref{tab:results}, or \eqref{eq:main} – all are counted correctly.

Technical Deep Dive

How It Works

1. Comment Removal

The script uses a two-step approach to handle comments:

# Step 1: Remove lines starting with %
# Step 2: Remove inline comments but preserve \%
TEXCONTENT=$(grep -v '^\s*%' "$TEXFILE" | sed 's/\([^\\]\)%.*$/\1/')

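The order matters. The sed pattern needs a non-backslash character immediately before the %, so it cannot match a % in the first column of a line. grep therefore removes full-line comments first, and sed then trims inline comments from the lines that remain.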

Test cases:

LaTeX Code | Processed As
\label{fig:test} | Included
% \label{fig:old} | Excluded
Text \ref{fig:a} % comment | \ref{fig:a} included
50\% efficiency | \% preserved
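
To try the two-step filter in isolation, here is a minimal sketch that wraps it in a helper and feeds it the four test cases. The name strip_comments is hypothetical (it is not part of the script), and the sketch swaps \s for the POSIX class [[:space:]], an assumption made for portability.

```shell
#!/bin/bash
# strip_comments: hypothetical helper wrapping the script's two-step filter.
strip_comments() {
    # Step 1 drops full-line comments; step 2 cuts inline comments but keeps \%.
    grep -v '^[[:space:]]*%' | sed 's/\([^\\]\)%.*$/\1/'
}

# Feed the four test cases through the filter.
strip_comments <<'EOF'
\label{fig:test}
% \label{fig:old}
Text \ref{fig:a} % comment
50\% efficiency
EOF
```

The second input line disappears, the third is cut just before the %, and the escaped \% on the last line survives.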

2. Sequence Preservation

Uses awk '!seen[$0]++' to remove duplicates while maintaining order:

# Traditional approach (loses order)
sort -u

# Our approach (preserves order)
awk '!seen[$0]++'

Why this matters:

If your document has:

\section{Methods}        % Line 50
\label{fig:workflow}     % Line 65

\section{Results}        % Line 150
\label{fig:accuracy}     % Line 170

\section{Appendix}       % Line 300
\label{fig:extra}        % Line 310 (unreferenced)

The output shows fig:extra in document order (not alphabetically sorted), making it easier to locate.
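
The difference between the two approaches is easy to see on a small list. The sketch below uses made-up label names and a hypothetical wrapper, dedupe_in_order, around the awk idiom. The expression seen[$0]++ evaluates to 0 the first time a line appears, so !seen[$0]++ is true only on first sight and the line is printed once.

```shell
#!/bin/bash
# Labels in document order, with one duplicate (sample data, not from a real paper).
labels=$'workflow\naccuracy\nworkflow\nextra'

# dedupe_in_order: hypothetical wrapper around the awk idiom used in the script.
dedupe_in_order() {
    awk '!seen[$0]++'
}

echo "sort -u (alphabetical):"
printf '%s\n' "$labels" | sort -u

echo "awk (document order):"
printf '%s\n' "$labels" | dedupe_in_order
```

sort -u prints accuracy, extra, workflow, while the awk version keeps workflow, accuracy, extra in the order they were first seen.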

3. Pattern Matching

Uses Perl-compatible regex with grep -oP:

# For tables and figures: catches both \ref and \autoref
grep -oP '\\(auto)?ref\{tab:\K[^}]+'

# For equations: catches \ref, \autoref, and \eqref
grep -oP '\\((auto|eq))?ref\{eq:\K[^}]+'

Breakdown:

  • \\(auto)?ref\{tab: – Match \ref{tab: or \autoref{tab:
  • \\((auto|eq))?ref\{eq: – Match \ref{eq:, \autoref{eq:, or \eqref{eq:
  • \K – Discard everything matched so far
  • [^}]+ – Capture everything until }

Supported reference commands:

  • \ref{...} – Standard LaTeX reference
  • \autoref{...} – Automatic reference from hyperref package
  • \eqref{...} – Equation reference (for equations only)

Examples:

Labels detected:

  • \label{tab:results} extracts results
  • \label{fig:analysis_2023} extracts analysis_2023
  • \label{eq:main_theorem} extracts main_theorem

References detected:

  • \ref{fig:diagram} matches diagram
  • \autoref{tab:results} matches results
  • \eqref{eq:einstein} matches einstein

Not matched:

  • % \label{tab:old} (filtered by comment removal)
  • \label{sec:intro} (different prefix – use section extension)
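
The patterns above can be exercised on a single sample sentence. This is a sketch under two assumptions: the helper name extract_refs is hypothetical, and for brevity it accepts \eqref for every prefix, whereas the script itself only accepts \eqref for equations. It needs GNU grep built with PCRE support.

```shell
#!/bin/bash
# extract_refs: hypothetical helper; prints the names referenced with a given prefix.
# Requires GNU grep built with PCRE support (-P).
extract_refs() {
    local prefix=$1
    # The prefix is spliced in between two single-quoted pattern fragments.
    grep -oP '\\(auto|eq)?ref\{'"$prefix"':\K[^}]+'
}

sample='See \ref{fig:diagram} and \autoref{tab:results}; \eqref{eq:einstein} follows.'
for prefix in fig tab eq; do
    printf '%s -> %s\n' "$prefix" "$(printf '%s\n' "$sample" | extract_refs "$prefix")"
done
```

Commands such as \pageref are not matched, because the pattern only allows auto or eq between the backslash and ref.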

Advanced Usage

1. Check Multiple Files

Create check_all.sh:

#!/bin/bash

for file in *.tex; do
    echo "File: $file"
    ./check_latex_refs.sh "$file"
    echo ""
done

2. Generate Timestamped Reports

# Create dated log
./check_latex_refs.sh paper.tex > "audit_$(date +%Y%m%d_%H%M%S).log"

# Compare before/after revision
./check_latex_refs.sh paper.tex > before_revision.log
# ... make changes ...
./check_latex_refs.sh paper.tex > after_revision.log
diff before_revision.log after_revision.log

3. Git Pre-commit Hook

Add to .git/hooks/pre-commit:

#!/bin/bash

# Run check
./check_latex_refs.sh main.tex > ref_check.log

# Count unreferenced items
UNREFERENCED=$(grep -c "^  " ref_check.log)

if [ "$UNREFERENCED" -gt 0 ]; then
    echo "Warning: $UNREFERENCED unreferenced items found"
    cat ref_check.log

    # Git hooks do not run with stdin attached to the terminal, so read from /dev/tty
    read -p "Continue commit? (y/n) " -n 1 -r < /dev/tty
    echo
    if [[ ! $REPLY =~ ^[Yy]$ ]]; then
        exit 1
    fi
fi
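
The hook relies on one convention of the report: every flagged item is printed with a two-space indent. A minimal sketch of that counting step, using a hypothetical helper named count_flagged and a made-up report, might look like this. Note that the count includes uncited and missing bibliography entries as well as unreferenced labels.

```shell
#!/bin/bash
# count_flagged: hypothetical helper; counts report lines indented by two spaces.
count_flagged() {
    grep -c '^  ' || true   # grep exits 1 when the count is 0; do not treat that as failure
}

# Made-up report in the same layout the checker prints.
report='=== TABLES ===
Total labels: 2
Total refs: 1
Unreferenced tables:
  tab:extra_results

=== CITATIONS ===
Missing bibliography entries (cited but not in .bib):
  nguyen2023missing'

printf '%s\n' "$report" | count_flagged   # prints 2
```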

4. Makefile Integration

# Makefile

.PHONY: all check clean audit

all: manuscript.pdf check

manuscript.pdf: manuscript.tex
	pdflatex manuscript.tex
	bibtex manuscript
	pdflatex manuscript.tex
	pdflatex manuscript.tex

check:
	@./check_latex_refs.sh manuscript.tex

clean:
	rm -f *.aux *.log *.bbl *.blg *.out *.toc

audit:
	@./check_latex_refs.sh manuscript.tex > audit_$(shell date +%Y%m%d).log
	@cat audit_$(shell date +%Y%m%d).log

Usage:

make              # Compile and check
make check        # Run reference check only
make audit        # Generate timestamped audit

5. Pre-submission Checklist Script

Create pre_submit.sh:

#!/bin/bash

echo "Pre-submission Checklist"
echo "========================"

# 1. Reference check
echo "1. Checking references..."
./check_latex_refs.sh manuscript.tex > ref_audit.log

# 2. Count issues
ISSUES=$(grep -c "^  " ref_audit.log)
if [ "$ISSUES" -eq 0 ]; then
    echo "   No unreferenced items"
else
    echo "   $ISSUES unreferenced items found"
    cat ref_audit.log
fi

# 3. Check compilation
echo "2. Checking compilation..."
pdflatex -interaction=nonstopmode manuscript.tex > /dev/null 2>&1
if [ $? -eq 0 ]; then
    echo "   Compiles successfully"
else
    echo "   Compilation errors found"
fi

# 4. Check bibliography
echo "3. Checking bibliography..."
if bibtex manuscript > /dev/null 2>&1; then
    echo "   Bibliography processed"
else
    echo "   BibTeX reported errors"
fi

echo "========================"
echo "Checklist complete!"

Extensions and Customizations

Add Line Numbers

Modify the unreferenced item display to show line numbers:

while IFS= read -r label; do
    if ! echo "$TAB_REFS" | grep -qx "$label"; then
        # -F matches \label{...} literally; head keeps only the first occurrence
        LINE=$(grep -nF "\\label{tab:$label}" "$TEXFILE" | head -n 1 | cut -d: -f1)
        echo "  tab:$label (line $LINE)"
    fi
done <<< "$TAB_LABELS_UNIQ"

Output:

Unreferenced tables:
  tab:appendix_data (line 145)
  tab:extra_results (line 203)
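
Because the line lookup searches the original file rather than the comment-free text, a commented-out copy of the same label can be reported instead of the live one. One possible refinement, sketched here with a hypothetical helper named label_line, filters out matches on full-line comments before taking the first hit. It does not handle a label that sits after an inline % on an otherwise live line.

```shell
#!/bin/bash
# label_line: hypothetical helper; prints the line number of the first
# \label{<label>} in <file> that is not on a full-line comment.
label_line() {
    local label=$1 file=$2
    grep -nF "\\label{$label}" "$file" \
        | grep -v '^[0-9]*:[[:space:]]*%' \
        | head -n 1 \
        | cut -d: -f1
}

# Small demonstration file with a commented-out duplicate of the label.
demo=$(mktemp)
cat > "$demo" <<'EOF'
% \label{tab:extra_results}
\begin{table}
\label{tab:extra_results}
\end{table}
EOF
echo "tab:extra_results (line $(label_line tab:extra_results "$demo"))"
rm -f "$demo"
```

Here the commented copy on line 1 is skipped and line 3 is reported.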

Add Section Labels

Add this after the equations section:

# Sections (preserving order, excluding comments)
echo -e "\n=== SECTIONS ==="
SEC_LABELS=$(echo "$TEXCONTENT" | grep -oP '\\label\{sec:\K[^}]+')
SEC_LABELS_UNIQ=$(echo "$SEC_LABELS" | awk '!seen[$0]++')
# Catches both \ref and \autoref, matching the tables and figures checks
SEC_REFS=$(echo "$TEXCONTENT" | grep -oP '\\(auto)?ref\{sec:\K[^}]+' | awk '!seen[$0]++')
SEC_COUNT=$(echo "$SEC_LABELS_UNIQ" | grep -v '^$' | wc -l)
SECREF_COUNT=$(echo "$SEC_REFS" | grep -v '^$' | wc -l)

echo "Total section labels: $SEC_COUNT"
echo "Total section refs: $SECREF_COUNT"
echo "Unreferenced sections:"

while IFS= read -r label; do
    if ! echo "$SEC_REFS" | grep -qx "$label"; then
        echo "  sec:$label"
    fi
done <<< "$SEC_LABELS_UNIQ"

Color Output

Add ANSI color codes for better visibility:

# Add at top of script (after shebang)
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color

# Use in output
echo -e "${BLUE}=== TABLES ===${NC}"
echo "Total labels: $TAB_COUNT"
echo "Total refs: $REF_COUNT"

# Equal counts do not prove every table is referenced, so test each label
UNREF_TABS=""
while IFS= read -r label; do
    [ -z "$label" ] && continue
    if ! echo "$TAB_REFS" | grep -qxF -- "$label"; then
        UNREF_TABS+="  ${YELLOW}tab:$label${NC}\n"
    fi
done <<< "$TAB_LABELS_UNIQ"

if [ -z "$UNREF_TABS" ]; then
    echo -e "${GREEN}All tables referenced${NC}"
else
    echo -e "${RED}Unreferenced tables:${NC}"
    echo -en "$UNREF_TABS"
fi

Export to CSV

Create a CSV export function:

#!/bin/bash
# Export unreferenced items to CSV

TEXFILE=$1
OUTFILE="${TEXFILE%.tex}_unreferenced.csv"

# Run the check and extract unreferenced items
./check_latex_refs.sh "$TEXFILE" > /tmp/ref_check.log

# Create CSV header
# Create CSV header (two columns, matching the rows written below)
echo "Type,Label" > "$OUTFILE"

# Extract and format
grep "^  tab:" /tmp/ref_check.log | sed 's/^  /table,/' >> "$OUTFILE"
grep "^  fig:" /tmp/ref_check.log | sed 's/^  /figure,/' >> "$OUTFILE"
grep "^  eq:" /tmp/ref_check.log | sed 's/^  /equation,/' >> "$OUTFILE"

echo "CSV exported to: $OUTFILE"

JSON Output

Add JSON export capability:

# Add --json flag support: place this block inside check_latex_refs.sh,
# after TEXCONTENT is defined (it is not a standalone script)

if [ "$2" == "--json" ]; then
    # Extract data
    TAB_LABELS=$(echo "$TEXCONTENT" | grep -oP '\\label\{tab:\K[^}]+' | awk '!seen[$0]++')
    TAB_REFS=$(echo "$TEXCONTENT" | grep -oP '\\(auto)?ref\{tab:\K[^}]+' | awk '!seen[$0]++')

    # Build unreferenced array
    UNREF_TABS=$(comm -23 <(echo "$TAB_LABELS" | sort) <(echo "$TAB_REFS" | sort) | \
        awk '{printf "\"%s\",", $0}' | sed 's/,$//')

    # Output JSON
    cat << EOF
{
  "file": "$TEXFILE",
  "timestamp": "$(date -Iseconds)",
  "tables": {
    "total_labels": $(echo "$TAB_LABELS" | grep -v '^$' | wc -l),
    "total_refs": $(echo "$TAB_REFS" | grep -v '^$' | wc -l),
    "unreferenced": [$UNREF_TABS]
  }
}
EOF
fi
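
The comm and sort combination above returns the unreferenced labels alphabetically, and the labels are placed into the JSON without escaping. If document order or unusual characters matter, one option is a small array builder. This is a sketch, not part of the script: json_array is a hypothetical helper, and it assumes labels contain no control characters.

```shell
#!/bin/bash
# json_array: hypothetical helper; turns lines on stdin into a JSON array,
# keeping input order and escaping backslashes and double quotes.
json_array() {
    sed 's/\\/\\\\/g; s/"/\\"/g' \
        | awk 'BEGIN { printf "[" }
               NF    { printf "%s\"%s\"", (n++ ? "," : ""), $0 }
               END   { print "]" }'
}

printf '%s\n' appendix_data extra_results summary_stats | json_array
# prints ["appendix_data","extra_results","summary_stats"]
```

Blank lines are skipped, so an empty list of labels produces [].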

Workflow Integration Examples

1. Continuous Integration (CI)

For GitHub Actions (.github/workflows/latex-check.yml):

name: LaTeX Reference Check

on: [push, pull_request]

jobs:
  check-refs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Run reference checker
        run: |
          chmod +x check_latex_refs.sh
          ./check_latex_refs.sh manuscript.tex

      - name: Count issues
        run: |
          ISSUES=$(./check_latex_refs.sh manuscript.tex | grep "^  " | wc -l)
          echo "Found $ISSUES unreferenced items"
          if [ "$ISSUES" -gt 5 ]; then
            echo "Too many unreferenced items!"
            exit 1
          fi

2. VS Code Task

Add to .vscode/tasks.json:

{
  "version": "2.0.0",
  "tasks": [
    {
      "label": "Check LaTeX References",
      "type": "shell",
      "command": "./check_latex_refs.sh",
      "args": ["${file}"],
      "problemMatcher": [],
      "presentation": {
        "reveal": "always",
        "panel": "new"
      }
    }
  ]
}

3. Overleaf/Git Sync

For projects synced between Overleaf and Git:

#!/bin/bash
# sync_and_check.sh

# Pull from Overleaf
git pull origin master

# Run check
./check_latex_refs.sh main.tex > ref_check.log

# If clean, push changes
if [ "$(grep -c "^  " ref_check.log)" -eq 0 ]; then
    git add .
    git commit -m "Update: references checked"
    git push origin master
else
    echo "Unreferenced items found. Fix before pushing."
    cat ref_check.log
fi

Comparison with Alternatives

Tool | Pros | Cons | Best For
This script | Fast, customizable, comment-aware, preserves order | Single file only | Quick audits, CI/CD
refcheck package | LaTeX-integrated, visual markers | Must recompile, clutters PDF | During writing
chktex | Comprehensive linting, many checks | Verbose, complex output | Deep analysis
VS Code LaTeX Workshop | Real-time, editor-integrated | Editor-specific, no batch | Active editing
Python scripts | Very flexible, HTML reports | Slower, requires Python | Custom workflows

Real-World Case Study

Initial State

Manuscript for journal submission with 8 months of revisions:

=== TABLES ===
Total labels: 8
Total refs: 7
Unreferenced tables:
  tab:correlation_matrix

=== FIGURES ===
Total labels: 12
Total refs: 12

=== EQUATIONS ===
Total labels: 6
Total refs: 6

=== CITATIONS ===
Total bib entries: 67
Uncited bibliography entries:
  lecun2015deep
  goodfellow2016deep
  chollet2017deep
  bishop2006pattern
  murphy2012machine
  hastie2009elements
  james2013introduction
  vapnik1995nature
  cover2006elements

Actions Taken

  1. Moved tab:correlation_matrix to supplementary materials
  2. Removed 9 unused deep learning references (leftover from earlier drafts)
  3. Cleaned up bibliography from 67 to 58 entries
  4. Verified all remaining citations were relevant

Final State

=== TABLES ===
Total labels: 7
Total refs: 7
Unreferenced tables:

=== CITATIONS ===
Total bib entries: 58
Uncited bibliography entries:

Time saved: Approximately 30 minutes of manual checking

Outcome: Clean submission, no reviewer comments about references

Limitations and Workarounds

Current Limitations

  1. Single file only: Does not traverse \input{} or \include{} commands
  2. Standard prefixes: Assumes the tab:, fig:, and eq: conventions (plus sec: if you add the section extension)
  3. Reference commands: Detects \ref, \autoref, and \eqref (for equations)
  4. Citation commands: Only detects \cite, \citep, \citet
  5. Bibliography format: Requires standard \bibliography{} command

Workarounds

For multi-file projects:

# Combine files first
cat main.tex chapter*.tex appendix.tex > combined.tex
./check_latex_refs.sh combined.tex
rm combined.tex
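
Concatenating with cat requires listing the files by hand. An alternative is to follow the \input{} and \include{} commands themselves. The sketch below is one way to do that, not a feature of the script: flatten_tex is a hypothetical helper, and it assumes each command sits at the start of its own line and that paths are relative to the directory of the top-level file.

```shell
#!/bin/bash
# flatten_tex: hypothetical helper; prints <file> with \input{...} and
# \include{...} lines replaced by the contents of the named files.
# Commented-out commands are left alone because the pattern is anchored
# to the start of the line.
flatten_tex() {
    local file=$1
    local root=${2:-$(dirname "$file")}
    local re='^[[:space:]]*\\(input|include)\{([^}]+)\}'
    local line target

    # "|| [ -n "$line" ]" keeps a final line that lacks a trailing newline
    while IFS= read -r line || [ -n "$line" ]; do
        if [[ $line =~ $re ]]; then
            target="$root/${BASH_REMATCH[2]}"
            [[ $target == *.tex ]] || target="$target.tex"
            if [ -f "$target" ]; then
                flatten_tex "$target" "$root"   # recurse into the included file
                continue
            fi
        fi
        printf '%s\n' "$line"
    done < "$file"
}
```

With the function defined, flatten_tex main.tex > combined.tex produces a single file that can be passed to ./check_latex_refs.sh as before. Included files that cannot be found are left as their original \input line.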

For custom prefixes:

# Modify the grep patterns in the script
# Change tab: to tbl:
grep -oP '\\label\{tbl:\K[^}]+'

For additional citation commands:

# Add to CITATIONS line:
CITATIONS=$(echo "$TEXCONTENT" | grep -oP '\\cite(p|t|author|year)?\{\K[^}]+' | ...)
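
Citation commands with optional arguments, such as \citep[see][p.~5]{key}, are not matched by the patterns above because the brace does not follow the command name directly. A broader pattern is sketched here under stated assumptions: extract_cite_keys is a hypothetical helper, it treats any command that starts with \cite as a citation (which may be too generous for some packages), and it needs GNU grep with PCRE support.

```shell
#!/bin/bash
# extract_cite_keys: hypothetical helper; prints each cited key once, in order.
# Matches \cite, \citep, \citet, \citeauthor, \citeyear, starred forms, and
# any number of optional [..] arguments. Requires GNU grep with -P.
extract_cite_keys() {
    grep -oP '\\cite[a-zA-Z]*\*?(\[[^\]]*\])*\{\K[^}]+' \
        | tr ',' '\n' \
        | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
        | awk 'NF && !seen[$0]++'
}

printf '%s\n' 'As shown by \citet{smith2020} and others \citep[see][p.~5]{jones2019, smith2020}.' \
    | extract_cite_keys
```

For the sample sentence this prints smith2020 and then jones2019, each once.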

For biblatex:

# Change BIBFILE detection:
BIBFILE=$(echo "$TEXCONTENT" | grep -oP '\\addbibresource\{\K[^}]+' | head -1)
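
Both detection lines keep only the first match, so a document that lists several files, as in \bibliography{refs,extra}, is checked against a file name that does not exist. A sketch that accepts either command and splits comma-separated lists follows. The helper name find_bib_files is hypothetical, and the sketch expects comment-free LaTeX on standard input.

```shell
#!/bin/bash
# find_bib_files: hypothetical helper; reads comment-free LaTeX on stdin and
# prints one .bib filename per line, for BibTeX (\bibliography{a,b}) and
# biblatex (\addbibresource{a.bib}) alike. Requires GNU grep with -P.
find_bib_files() {
    grep -oP '\\(bibliography|addbibresource)\{\K[^}]+' \
        | tr ',' '\n' \
        | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
        | while IFS= read -r name; do
              [ -z "$name" ] && continue
              [[ $name == *.bib ]] || name="$name.bib"   # BibTeX omits the extension
              printf '%s\n' "$name"
          done \
        | awk '!seen[$0]++'
}

printf '%s\n' '\bibliography{refs, extra}' '\addbibresource{local.bib}' | find_bib_files
```

\bibliographystyle{...} is not matched, because the pattern requires the opening brace directly after the command name. The resulting list could then be looped over to collect entries from every file.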

Troubleshooting

Issue: Script shows no output

Solution:

# Check file exists and has content
ls -la manuscript.tex
wc -l manuscript.tex

# Check for labels
grep -F '\label{' manuscript.tex | head -5

# Run with debug mode
bash -x check_latex_refs.sh manuscript.tex

Issue: Bibliography not found

Solution:

# Check bibliography command exists
grep "bibliography" manuscript.tex

# Verify .bib file location
ls -la *.bib

# Check file permissions
ls -la references.bib

Issue: Commented lines still appear

Solution:

# Check sed version
sed --version

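# Try a portable variant (POSIX character class instead of the GNU-only \s)
TEXCONTENT=$(grep -v '^[[:space:]]*%' "$TEXFILE" | sed 's/\([^\\]\)%.*$/\1/')

# Or use Perl (skips full-line comments, then trims inline ones, keeping \%)
TEXCONTENT=$(perl -ne 'next if /^\s*%/; s/(?<!\\)%.*$//; print' "$TEXFILE")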

Issue: Special characters in labels

Solution:

# Compare labels as fixed strings so characters such as . are not treated as regex
echo "$TAB_REFS" | grep -qxF -- "$label"

# Or extract labels with a simpler pattern
grep -F '\label{tab:' manuscript.tex | cut -d'{' -f2 | cut -d'}' -f1
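
The reason special characters cause trouble is that grep -qx treats the label as a regular expression, so a dot matches any character. The following sketch contrasts the two comparison modes on made-up data. The helper name is_referenced is hypothetical.

```shell
#!/bin/bash
# is_referenced: hypothetical helper; succeeds if <label> equals a whole line
# of <refs>, compared as a fixed string (-F) rather than as a regex.
is_referenced() {
    local refs=$1 label=$2
    printf '%s\n' "$refs" | grep -qxF -- "$label"
}

# Referenced names (sample data). Note there is no reference to "fig.1".
refs=$'figx1\nresults'
label='fig.1'

# Regex comparison: "." matches any character, so "fig.1" matches "figx1".
if printf '%s\n' "$refs" | grep -qx -- "$label"; then
    echo "regex: $label looks referenced (false match)"
fi

# Fixed-string comparison: no false match.
if ! is_referenced "$refs" "$label"; then
    echo "fixed: $label is unreferenced"
fi
```

Both messages are printed, which shows the regex comparison hiding an unreferenced label that the fixed-string comparison reports.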

Issue: False positives

Solution:

# Check if label actually exists in source
grep -n "label{tab:problematic}" manuscript.tex

# Verify it's not in a comment
grep "tab:problematic" manuscript.tex | grep -v '^[[:space:]]*%'

Performance Optimization

For very large documents (>10,000 lines):

# Use parallel processing for multiple files
find . -name "*.tex" | parallel ./check_latex_refs.sh {}

# Cache intermediate results
TEXCONTENT=$(grep -v '^\s*%' "$TEXFILE" | sed 's/\([^\\]\)%.*$/\1/' | tee /tmp/cleaned.tex)

# Use faster grep alternatives
# Install ripgrep: sudo apt install ripgrep
rg -oP '\\label\{tab:\K[^}]+' "$TEXFILE"

How to Cite

Samad, M. A. (2026). LaTeX reference checker: Bash script to find unused labels and missing citations. ScholarsNote. https://doi.org/10.59350/XXXXXXXX-XXXXX

BibTeX:

@misc{samad2026latexrefchecker,
  author       = {Samad, Md Abdus},
  title        = {LaTeX Reference Checker: Bash Script to Find Unused Labels and Missing Citations},
  year         = {2026},
  month        = jan,
  howpublished = {ScholarsNote},
  url          = {https://www.scholarsnote.org/posts/latex-label-prefix-guide-blog/},
  doi          = {10.59350/XXXXXXXX-XXXXX},
  note         = {Accessed: 2026-01-03}
}
This post is licensed under CC BY 4.0 by the author.