[personal profile] idupree
(The canonical location of this blog post is now Code filesystem trees & dependency digraphs on my web site.)

Consider a collection of source code. It's arranged in a structure in the filesystem. Often, each file is a module named according to its directory-and-file-path. The filesystem structure is a tree (which is a directed graph).

At the same time, the code forms a dependency tree/DAG. A module is a node, and a module importing another module forms an edge. Alternately, items in the code such as classes or functions can be nodes, and an item using another item forms an edge. This is usually a directed acyclic graph. In some cases, cycles are allowed.

In good code, is there a similarity between the filesystem structure and the dependency structure?

Sometimes a library has interface modules that import implementation modules deeper in the filesystem hierarchy. Sometime implementation modules import a "utils" module that's their cousin. In fact, the entire standard library is a cousin to most code, and most code imports something from the standard library. Modules rarely import their direct parents (e.g. "MyLibrary.Types" wouldn't import "MyLibrary" even if "MyLibrary" imports "MyLibrary.Types"). That's not a lot to go on.

What sort of similarity might we look for? Can the nodes of each graph be the same? Both can have files as nodes. However, the filesystem tree has files as leaf nodes and directories as non-leaf nodes. The dependency graph usually does not have directories as nodes at all, and has files as both leaf and non-leaf nodes. We could make the graphs look more similar by modifying the dependency graph. Instead of files being nodes, directories could be nodes that are the combination of all their direct file children's contents. This is tempting, though it throws away a lot of information. Alternatively, we could try to determine a correspondence between the two structures' nodes using graph-theoretic algorithms. This might be interesting (but it might not

What can we do without looking for similarities?
  • Suppose MyLibrary.A imports YourLibrary.B and YourLibrary.C imports MyLibrary.D. That's often a bad idea even when it doesn't technically involve an import cycle.

  • We can detect unused imports. Some languages make them easy to detect; some (like C++) make it quite difficult. But unused imports are a detail to clean up, not a sign of badly organized code.

  • We can compute how balanced the filesystem tree or dependency graph is ("balancedness" as in "balanced binary trees"). We can compute how deep it is (tree depth is simple; digraph depth could be longest non-repeating path). Probably there is a level of unbalancedness, and a level of shallowness or depth (for a given amount of code), extreme enough that it would be bad style.

  • Etc.?

- still contemplating,
Anonymous( )Anonymous This account has disabled anonymous posting.
OpenID( )OpenID You can comment on this post while signed in with an account from many other sites, once you have confirmed your email address. Sign in using OpenID.
Account name:
If you don't have an account you can create one now.
HTML doesn't work in the subject.


If you are unable to use this captcha for any reason, please contact us by email at support@dreamwidth.org

Notice: This account is set to log the IP addresses of everyone who comments.
Links will be displayed as unclickable URLs to help prevent spam.



January 2014


Most Popular Tags

Style Credit

Expand Cut Tags

No cut tags
Page generated Oct. 22nd, 2017 04:58 am
Powered by Dreamwidth Studios