How to Count Lines of Source Code

This is a memo on how to measure the number of lines in a source code repository.

Introduction

While reading source code, I sometimes wonder how many lines the entire repository has.

This memo covers two ways to find out: a one-line shell command and a dedicated tool.

Note: This article was translated from my original post.

How to Count Lines of Source Code

Count Lines with a Command

You can quickly count the total number of lines in a git repository with the following command:

git ls-files | xargs wc -l

git ls-files lists every file tracked by the git repository, and xargs passes those paths to wc -l, which prints the line count of each file followed by a total.
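One caveat: the pipeline above splits paths on whitespace, so a tracked file with a space in its name is not counted correctly. Here is a minimal sketch of a NUL-delimited variant. The function name count_lines_per_file is made up for illustration, and the sketch assumes an xargs that supports -0 (the GNU and BSD versions do).

```shell
# Hypothetical helper (the name is illustrative, not from the article).
# Reads NUL-delimited paths on stdin and prints wc -l output for them.
count_lines_per_file() {
    # -0 makes xargs split on NUL bytes only, so spaces in paths survive.
    xargs -0 wc -l
}

# Usage inside a git repository (-z makes git emit NUL-delimited paths):
#   git ls-files -z | count_lines_per_file
```

In a very large repository xargs may run wc more than once, in which case several "total" lines appear, one per batch.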


If you want to count only specific files, you can use grep to filter them, like this:

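git ls-files | grep '\.go$' | xargs wc -l

In this example, only files with the .go extension are counted; the trailing $ anchors the match to the end of the path, so a name such as main.go.bak is excluded. You can filter on any pattern by changing the regular expression passed to grep.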
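Since the filter is just a regular expression, several extensions can be matched at once and whole directories can be excluded. The sketch below wraps two such filters in shell functions; keep_ext and drop_dir are hypothetical names chosen for this example.

```shell
# Hypothetical helpers; the names are illustrative, not from the article.

# Keep only paths whose extension is one of the alternatives in $1,
# e.g. 'py|pyi'. The pattern is anchored to the end of the path.
keep_ext() {
    grep -E "\\.($1)\$"
}

# Drop paths that live under the top-level directory named in $1.
drop_dir() {
    grep -v "^$1/"
}

# Usage inside a git repository:
#   git ls-files | keep_ext 'py|pyi' | drop_dir tests | xargs wc -l
```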


As a concrete example, let's count the lines of .py files in the requests library.

# git clone https://github.com/psf/requests.git
# cd requests
git ls-files | grep '\.py$' | xargs wc -l
      86 docs/_themes/flask_theme_support.py
     386 docs/conf.py
     132 setup.py
     180 src/requests/__init__.py
      14 src/requests/__version__.py
      50 src/requests/_internal_utils.py
     540 src/requests/adapters.py
     157 src/requests/api.py
     314 src/requests/auth.py
      17 src/requests/certs.py
      79 src/requests/compat.py
     561 src/requests/cookies.py
     151 src/requests/exceptions.py
     134 src/requests/help.py
      33 src/requests/hooks.py
    1032 src/requests/models.py
      30 src/requests/packages.py
     831 src/requests/sessions.py
     128 src/requests/status_codes.py
      99 src/requests/structures.py
    1094 src/requests/utils.py
      14 tests/__init__.py
      23 tests/compat.py
      58 tests/conftest.py
       8 tests/test_adapters.py
      27 tests/test_help.py
      22 tests/test_hooks.py
     428 tests/test_lowlevel.py
      13 tests/test_packages.py
    2839 tests/test_requests.py
      78 tests/test_structures.py
     165 tests/test_testserver.py
     926 tests/test_utils.py
       0 tests/testserver/__init__.py
     134 tests/testserver/server.py
      17 tests/utils.py
   10800 total

The .py files total 10,800 lines, including blank lines and comments.
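If only the grand total is needed, the per-file listing can be skipped by concatenating the files before counting. This is a minimal sketch, assuming an xargs with -0 support; total_lines is a hypothetical name.

```shell
# Hypothetical helper: read NUL-delimited paths on stdin and print one number,
# the combined line count of all those files.
total_lines() {
    # cat joins every file into one stream, so wc -l prints a single total
    # even when xargs has to run cat in several batches.
    xargs -0 cat | wc -l
}

# Usage inside a git repository, using a git pathspec instead of grep:
#   git ls-files -z -- '*.py' | total_lines
```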

Count Lines with Tools

There are many tools available for counting lines of source code; even a quick search turns up several popular ones.

Here we'll try the well-known cloc.

First, install cloc using your preferred package manager:

npm install -g cloc              # https://www.npmjs.com/package/cloc
sudo apt install cloc            # Debian, Ubuntu
sudo yum install cloc            # Red Hat, Fedora
sudo dnf install cloc            # Fedora 22 or later
sudo pacman -S cloc              # Arch
sudo emerge -av dev-util/cloc    # Gentoo https://packages.gentoo.org/packages/dev-util/cloc
sudo apk add cloc                # Alpine Linux
doas pkg_add cloc                # OpenBSD
sudo pkg install cloc            # FreeBSD
sudo port install cloc           # macOS with MacPorts
brew install cloc                # macOS with Homebrew
winget install AlDanial.Cloc     # Windows with winget
choco install cloc               # Windows with Chocolatey
scoop install cloc               # Windows with Scoop


Now, let's count the requests library using cloc:

# git clone https://github.com/psf/requests.git
# cd requests
cloc .
      93 text files.
      81 unique files.                              
      23 files ignored.

github.com/AlDanial/cloc v 2.02  T=0.28 s (285.3 files/s, 60468.6 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                          35           1997           1994           6809
reStructuredText                16            858            243           1931
Markdown                         9            589              6           1626
DOS Batch                        1             34              2            227
make                             2             34              7            202
YAML                             9             33             36            177
CSS                              1             32              2            143
HTML                             3             30              3            114
INI                              1              3              0             15
Text                             2              0              0             10
TOML                             1              1              0              9
SVG                              1              0              0              1
-------------------------------------------------------------------------------
SUM:                            81           3611           2293          11264
-------------------------------------------------------------------------------

You can see the number of files, blank lines, comment lines, and code lines per language.
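Note that cloc . walks the directory tree, so it may also pick up files that git does not track. cloc has a --vcs=git option that takes its file list from git ls-files, and --include-lang restricts the report to the named languages. The wrapper below is a hypothetical convenience function, not part of cloc.

```shell
# Hypothetical wrapper (not part of cloc): count only git-tracked Python files.
cloc_tracked_python() {
    # --vcs=git         take the file list from git ls-files
    # --include-lang    report only the listed languages
    cloc --vcs=git --include-lang=Python
}

# Usage inside a git repository with cloc installed:
#   cloc_tracked_python
```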

For Python code: 1997 (blank) + 1994 (comment) + 6809 (code) = 10800 (total)

This matches the 10,800 total reported by wc -l above.
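The same check can be scripted. The sketch below reads a cloc report on standard input and adds the blank, comment, and code columns for one language. cloc_physical_lines is a hypothetical helper, and it assumes the default table layout shown above.

```shell
# Hypothetical helper: read a cloc report on stdin and print
# blank + comment + code for the language named in $1.
cloc_physical_lines() {
    awk -v lang="$1" '
        # Data rows end with four integer columns: files blank comment code.
        NF >= 5 && $NF ~ /^[0-9]+$/ && $(NF-3) ~ /^[0-9]+$/ {
            # Everything before those four columns is the language name,
            # which may contain spaces (for example "DOS Batch").
            name = $1
            for (i = 2; i <= NF - 4; i++) name = name " " $i
            if (name == lang) print $(NF-2) + $(NF-1) + $NF
        }
    '
}

# Usage inside a repository with cloc installed:
#   cloc . | cloc_physical_lines Python
```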

cloc can also count a specific git commit or report the difference between two versions. For details, see the official cloc repository.

Conclusion

That's a quick memo on how to count the number of source code lines in a git repository.

Hope it helps someone!

