As an experiment I tried running
jekyll --lsi. In return I would get
an automatically generated list of related posts after every post (where applicable).
I have no need for this, but it could have been interesting. Unfortunately,
this is how things look so far:
17465 atubbs 20 0 920m 877m 2984 R 101 11.0 2006:17 jekyll
It’s been running for 33 hours. I have no good sense of whether it’s close to finishing. The only progress indicator is a few dozen dots on my screen. I hope those dots don’t represent articles, or else the universe is going to end before this completes.
Interesting or not, multi-day site rebuilds aren’t appealing. Keep in mind this is with ruby-gsl installed; it would be much slower without it. Ouch.
This is consistent with my general impression of Jekyll so far. It bogs down when confronted with anything larger than a trivial site (I only have about 1600 posts with about 7.3 MB of Textile/Markdown).
I may still be using Jekyll incorrectly. I switched to Jekyll’s notion of
continuous integration with
jekyll --server --auto. Jekyll then detects
when a file has changed and regenerates the site. In practice this forces a pretty
substantial rebuild, because all of the paging files and folders are regenerated,
even if the only change is a spelling fix in an
existing post. I like this workflow, but I greatly dislike the
abysmal performance. I’m not sure whether it would be easier to add more
threads or smarter update logic.
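To make "smarter update logic" a bit more concrete, here is a minimal sketch in Python (not Jekyll's Ruby, and not how Jekyll works internally). It assumes a hypothetical manifest file of content hashes kept outside the source directory, and only answers the question "which source files actually changed since the last build?". The names changed_sources and file_digest are made up for illustration.

```python
from __future__ import annotations

import hashlib
import json
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_sources(source_dir: Path, manifest_path: Path) -> list[Path]:
    """List files under source_dir whose contents differ from the manifest.

    The manifest (a hypothetical JSON file, stored outside source_dir) maps
    relative paths to content hashes from the previous run. It is rewritten
    on every call, so the next call only reports newer changes.
    """
    previous = {}
    if manifest_path.exists():
        previous = json.loads(manifest_path.read_text())

    current = {}
    changed = []
    for path in sorted(source_dir.rglob("*")):
        if not path.is_file():
            continue
        key = str(path.relative_to(source_dir))
        current[key] = file_digest(path)
        # New files have no previous hash, so they also count as changed.
        if previous.get(key) != current[key]:
            changed.append(path)

    manifest_path.write_text(json.dumps(current, indent=2))
    return changed
```

A spelling fix in one post would then yield a one-element list rather than a reason to regenerate everything. The hard part this sketch leaves out is working out which paging and index files depend on the changed post.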
Since latent semantic indexing doesn’t scale to a site this size, I implemented a mechanism in my templates for providing related articles. The downside is that I’ll have to populate and curate this list myself. The upside is that it’s fast. For now I’m only enabling it on the single-post page; it won’t appear in feeds or on the index pages.
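The idea behind a hand-curated list can be sketched roughly as follows. This is an illustration in Python, not the actual template code: the POSTS records, their slugs, titles and URLs, and the "related" field are all hypothetical. Each post names its related posts by slug, and the list is only rendered on the single-post page.

```python
from html import escape

# Hypothetical post records; "related" holds hand-picked slugs.
POSTS = {
    "jekyll-lsi": {
        "title": "Trying jekyll --lsi",
        "url": "/posts/jekyll-lsi/",
        "related": ["jekyll-auto", "not-written-yet"],
    },
    "jekyll-auto": {
        "title": "Jekyll & auto-regeneration",
        "url": "/posts/jekyll-auto/",
        "related": [],
    },
}


def related_links(slug, posts):
    """Resolve curated slugs to (title, url) pairs, skipping unknown slugs."""
    links = []
    for other in posts[slug].get("related", []):
        if other in posts and other != slug:
            links.append((posts[other]["title"], posts[other]["url"]))
    return links


def render_related(slug, posts, single_post_page=True):
    """Return an HTML list of related posts, or "" when nothing should show."""
    links = related_links(slug, posts)
    # Feeds and index pages pass single_post_page=False and get nothing.
    if not single_post_page or not links:
        return ""
    items = "".join(
        f'<li><a href="{escape(url)}">{escape(title)}</a></li>'
        for title, url in links
    )
    return f"<ul>{items}</ul>"
```

The cost per post is a handful of dictionary lookups, which is why this stays fast no matter how many posts the site has. The price is that the "related" lists are only as good as the curation behind them.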