Instead boringly be replicating the directory base name where gitdm is installed and write it on each option inside the configuration file, just send it through the command line. Signed-off-by: Tiago Vignatti <tiago.vignatti@nokia.com>
151 lines
4.9 KiB
Plaintext
151 lines
4.9 KiB
Plaintext
The code in this directory makes up the "git data miner," a simple hack
|
|
which attempts to figure things out from the revision history in a git
|
|
repository.
|
|
|
|
|
|
INSTALLING GITDM
|
|
|
|
gitdm is a python script and doesn't need to be proper installed like other
|
|
normal programs. You just have to adjust your PATH variable, pointing it to
|
|
the directory of gitdm or alternatively create a symbolic link of the script
|
|
inside /usr/bin.
|
|
|
|
Before actually run gitdm you may want also to update the configuration file
|
|
(gitdm.config) with the needed information.
|
|
|
|
|
|
RUNNING GITDM
|
|
|
|
Run it like this:
|
|
|
|
git log -p -M [details] | gitdm [options]
|
|
|
|
The [details] tell git which changesets are of interest; the [options] can
|
|
be:
|
|
|
|
-a If a patch contains signoff lines from both Andrew Morton
|
|
and Linus Torvalds, omit Linus's.
|
|
|
|
-b dir Specify the base directory to fetch the configuration files.
|
|
|
|
-c file Specify the name of the gitdm configuration file.
|
|
By default, "./gitdm.config" is used.
|
|
|
|
-d Omit the developer reports, giving employer information
|
|
only.
|
|
|
|
-D Rather than create the usual statistics, create a
|
|
file (datelc) providing lines changed per day, where the first column
|
|
displays the changes happened only on that day and the second sums
|
|
the day it happnened with the previous ones. This option is suitable
|
|
for feeding to a tool like gnuplot.
|
|
|
|
-h file Generate HTML output to the given file
|
|
|
|
-l num Only list the top <num> entries in each report.
|
|
|
|
-o file Write text output to the given file (default is stdout).
|
|
|
|
-r pat Only generate statistics for changes to files whose
|
|
name matches the given regular expression.
|
|
|
|
-s Ignore Signed-off-by lines which match the author of
|
|
each patch.
|
|
|
|
-u Group all unknown developers under the "(Unknown)"
|
|
employer.
|
|
|
|
-z Dump out the hacker database to "database.dump".
|
|
|
|
A typical command line used to generate the "who write 2.6.x" LWN articles
|
|
looks like:
|
|
|
|
git log -p -M v2.6.19..v2.6.20 | \
|
|
gitdm -u -s -a -o results -h results.html
|
|
|
|
|
|
CONFIGURATION FILE
|
|
|
|
The main purpose of the configuration file is to direct the mapping of
|
|
email addresses onto employers. Please note that the config file parser is
|
|
exceptionally stupid and unrobust at this point, but it gets the job done.
|
|
|
|
Blank lines and lines beginning with "#" are ignored. Everything else
|
|
specifies a file with some sort of mapping:
|
|
|
|
EmailAliases file
|
|
|
|
Developers often post code under a number of different email
|
|
addresses, but it can be desirable to group them all together in
|
|
the statistics. An EmailAliases file just contains a bunch of
|
|
lines of the form:
|
|
|
|
alias@address canonical@address
|
|
|
|
Any patches originating from alias@address will be treated as if
|
|
they had come from canonical@address.
|
|
|
|
|
|
EmailMap file
|
|
|
|
Map email addresses onto employers. These files contain lines
|
|
like:
|
|
|
|
[user@]domain employer [< yyyy-mm-dd]
|
|
|
|
If the "user@" portion is missing, all email from the given domain
|
|
will be treated as being associated with the given employer. If a
|
|
date is provided, the entry is only valid up to that date;
|
|
otherwise it is considered valid into the indefinite future. This
|
|
feature can be useful for properly tracking developers' work when
|
|
they change employers but do not change email addresses.
|
|
|
|
|
|
GroupMap file employer
|
|
|
|
This is a variant of EmailMap provided for convenience; it contains
|
|
email addresses only, all of which are associated with the given
|
|
employer.
|
|
|
|
OTHER TOOLS
|
|
|
|
A few other tools have been added to this repository:
|
|
|
|
treeplot
|
|
Reads a set of commits, then generates a graphviz file charting the
|
|
flow of patches into the mainline. Needs to be smarter, but, then,
|
|
so does everything else in this directory.
|
|
|
|
findoldfiles
|
|
Simple brute-force crawler which outputs the names of any files
|
|
which have not been touched since the original (kernel) commit.
|
|
|
|
committags
|
|
I needed to be able to quickly associate a given commit with the
|
|
major release which contains it. First attempt used
|
|
"git tags --contains="; after it ran for a solid week, I concluded
|
|
there must be a better way. This tool just reads through the repo,
|
|
remembering tags, and creating a Python dictionary containing the
|
|
association. The result is an ugly 10mb pickle file, but, even so,
|
|
it's still a better way.
|
|
|
|
linetags
|
|
Crawls through a directory hierarchy, counting how many lines of
|
|
code are associated with each major release. Needs the pickle file
|
|
from committags to get the job done.
|
|
|
|
|
|
NOTES AND CREDITS
|
|
|
|
Gitdm was written by Jonathan Corbet; many useful contributions have come
|
|
from Greg Kroah-Hartman.
|
|
|
|
Please note that this tool is provided in the hope that it will be useful,
|
|
but it is not put forward as an example of excellence in design or
|
|
implementation. Hacking on gitdm tends to stop the moment it performs
|
|
whatever task is required of it at the moment. Patches to make it less
|
|
hacky, less ugly, and more robust are welcome.
|
|
|
|
Jonathan Corbet
|
|
corbet@lwn.net
|