Enigmatic Code

Programming Enigma Puzzles

Enigma 288a: Multiplets

notableFrom New Scientist #1435, 20th December 1984 [link]

A multiplet is a given set of words, and the game is to connect them with a network of as few link-words as possible. In the network, any two adjacent words are related in one of three ways:

(a) one letter is changed: e.g. MINUS → MINES
(b) one letter is added/removed: e.g. PLUS → PLUMS
(c) anagram: e.g. PLUMS → LUMPS or PLUMS → SLUMP

The picture shows a network for the multiplet “MINUS, ZERO, PLUS”, with 11 link-words.

Enigma 288a

For your Christmas prize, the multiplet is the first 10 natural numbers, “ONE, TWO, …, TEN”. Can you connect them with 15 or fewer link-words? If so, how?

All link-words must be of three or more letters. They must be in the Concise Oxford Dictionary, or simple inflexions of words in it. Words there marked as abbreviations are not allowed; so “SEN” for instance is not permitted. Common words are preferable to obscure ones.

[enigma288a] [enigma288]

Advertisements

3 responses to “Enigma 288a: Multiplets

  1. Jim Randell 14 June 2015 at 10:19 am

    I thought this was a tough problem to solve programatically.

    For a start I didn’t have a list of allowable words. I did find a file on my system under /usr/share/dict/web2 that is a list of words from Webster’s Second International [Dictionary], published in 1934. It includes SEN (which is apparently a monetary unit in some countries), and quite a lot of other obscure words that I didn’t really want in my solutions, so I used a list of stop words to weed out words I thought were too esoteric when they showed up.

    The problem with this list is that it doesn’t also include “simple inflexions of words” in the list (such as plurals, which are clearly useful for the example given with PLUS, MINUS and ZERO; for example, PLUMS does not appear in the list), but it didn’t look as though they would be particularly useful for the main problem, so I ignored this deficiency in my word list.

    (If anyone knows of a more suitable wordlist available for download I like to hear from them).

    The next problem was that the wordlist I did have has around 234,000 words in it. Constructing the complete graph that links all the words would take too long, so we need to work on a subset of the graph. And even when I have a suitable subgraph the problem of finding the smallest connected subgraph of a graph is an NP-hard problem (see Steiner Tree in Graphs).

    To try and make some progress on the problem I adopted “greedy” strategies for both of these issues.

    To construct a connected graph I use the following algorithm:

    1. start with a collection of graphs, initially each graph consists of a single vertex, one graph for each target vertex;

    2. while there is more than one graph in the collection, do the following:

    2.1a. if two of the graphs intersect (i.e. have a vertex in common), then coalesce the two graphs;

    2.1b. if all the graphs are disjoint then expand one of the graphs by adding in new vertices that can be reached from the perimeter of the graph.

    When the algorithm terminates the collection of graphs consists of a single graph that contains all the target vertices and is connected. Although the graph has been constructed with minimal “effort” there is no guarantee that the graph contains a minimally connected subgraph of the complete graph.

    So we can now search the constructed graph to try and find a connected subgraph.

    The strategy I used to try and connect the target vertices in an efficient way is as follows.

    1. start with a (non-empty) collection of vertices of the graph (called the “trunk”), we can just pick one of the target vertices.

    2. while there are target vertices remaining outside the trunk, do the following:

    2.1. for each target vertex consider all possible shortest paths from that vertex to the trunk. for each intermediate vertex in the path record which target vertex uses it.

    2.2. find an intermediate vertex that is used by the most target vertices.

    2.3a. if there are no intermediate vertices used by more than one target vertex, then complete the subgraph by adding in any shortest path from each remaining target vertex.

    2.3b. if there are intermediate vertices used by multiple target vertices then expand the trunk by adding in one such intermediate vertex and one of the shortest paths from that vertex to the trunk.

    When the algorithm terminates all target vertices are in the trunk (and if the trunk was initially connected then it will still be connected).

    This algorithm uses shortest paths to find a subgraph, but there is no guarantee that it will find the smallest possible subgraph.

    But we are not asked to find the smallest possible subgraph, only if it is possible to achieve a subgraph containing the 10 target vertices and 15 or fewer intermediate vertices (i.e. 25 or fewer vertices in total).

    So I implemented the algorithms in the Python program to appear below (I need to tidy it up a little before I post it). I tried each target vertex as the initial “trunk” and found that several of them did indeed produce a subgraph with 25 vertices.

    Solution: Yes, the words: ONE, TWO, THREE, FOUR, FIVE, SIX, SEVEN, EIGHT, NINE, TEN can be connected with 15 link words.

    Here is a diagram that shows some example solutions:

    Enigma 288a - Solution

    In the vertices with two options in them (e.g. “TOO/TOW”), either word can be used without changing the shape of the graph. There are also some cycles in the graph, which give rise to multiple spanning trees of the graph, any one of which is a viable solution.

    I prefer not to use the archaic terms THINE and THEE in my solutions, although they are in the dictionary.

    I found that choosing THREE or FIVE as the initial “trunk” gave a subgraph with 25 vertices (the others give a subgraph of size 26). Also, rather pleasingly, the solution given above can be found by selecting a “trunk” of TREE.

    The published example solution is similar (using THINE, TOO, FOR and TREE), but replaces TOE with TON, which then links to TEN instead of TEE.

  2. Jim Randell 14 June 2015 at 12:51 pm

    I found a list of Scrabble words online, the file is called sowpods.txt and contains 267,627 candidate words (including SEN), and it includes word inflexions. Here’s my Python 3 code (of course you also need a wordlist for it to work). You can add words that you don’t like to the list of stopwords.

    from collections import defaultdict
    from itertools import combinations
    from enigma import diff, Accumulator, printf
    
    # wordlist
    file = 'sowpods.txt'
    
    # stop words (add words you want to exclude here)
    stop = set([
      'sen', # SEN is explicitly disallowed
    ])
    
    # make the set of candidate words
    words = set()
    for w in open(file).readlines():
      w = w.strip().lower()
      if w.startswith('#'): continue
      if len(w) < 3: continue
      if w in stop: continue
      words.add(w)
    printf("{n} candidate words loaded from {file}", n=len(words))
    
    # extend the graph <g> with links between <ws1> and <ws2>
    # return the nodes added to the graph
    def extend(g, ws1, ws2):
    
      ps = set()
      for w1 in ws1:
        n1 = len(w1)
        s1 = sorted(w1)
        for w2 in ws2:
          if w2 == w1: continue
          n2 = len(w2)
          r = n2 - n1
    
          if any((
            # are they anagrams
            (r == 0 and sorted(w2) == s1),
    
            # has one letter been changed?
            (r == 0 and sum(1 for (c1, c2) in zip(w1, w2) if c1 != c2) == 1),
              
            # has one letter been added
            (r == 1 and any(w2[:i] + w2[i + 1:] == w1 for (i, _) in enumerate(w2))),
    
            # has one letter been removed
            (r == -1 and any(w1[:i] + w1[i + 1:] == w2 for (i, _) in enumerate(w1))),
          )):
            g[w1].add(w2)
            g[w2].add(w1)
            ps.add(w2)
      return ps
    
    # expand the graph <g> from vertices <vs>
    # return vertices added to the graph
    def expand(g, vs):
      vs = extend(g, vs, words)
      extend(g, vs, vs)
      return vs
    
    # make a graph including the target vertices
    def make_graph(targets):
    
      # initial graphs and vertices
      (g, vs) = ([], [])
      for t in set(targets):
        g.append(defaultdict(set))
        vs.append(set([t]))
    
      # make the connected graph
      while len(g) > 1:
    
        # look for graphs that intersect
        for (i, j) in combinations(range(len(g)), 2):
          if set(g[i].keys()).intersection(g[j].keys()):
            #printf("coalesce i={i} j={j}")
            vs[i].update(vs[j])
            for (k, v) in g[j].items():
              g[i][k].update(v)
            extend(g[i], vs[i], vs[i])
            del vs[j]
            del g[j]
            break
        else:
          # all graphs are disjoint, so choose a graph to expand
          i = min(range(len(g)), key=lambda i: len(vs[i]))
          #printf("expanding graph {i} ...")
          n = len(g[i].keys())
          vs[i] = expand(g[i], vs[i])
          assert len(g[i].keys()) > n, 'targets cannot be connected'
    
      # return the connected graph
      return g[0]
    
    # generate paths in graph <g>
    # starting at <x> passing through shells <ss> from index <i>
    # path is accumulated in <p>
    def paths(g, x, ss, i, p):
      if i < len(ss):
        for y in g[x].intersection(ss[i]):
          yield from paths(g, y, ss, i + 1, p + [y])
      else:
        yield p
    
    # generate shortest path in graph <s>
    # starting from a vertex in <s>
    # finishing at a vertex in <t>
    # maximum path length is <m>
    def shortest(g, s, t, m):
    
      # initial source and target shells
      ss = [set(s)]
      ts = [set(t)]
    
      while True:
        # have the shells intersected?
        xs = ss[0].intersection(ts[0])
        if xs: break
    
        if len(ss) + len(ts) > m: return
    
        # expand the smaller set
        q = min((ss, ts), key=lambda x: len(x[0]))
        q.insert(0, set().union(*(graph[x] for x in q[0])).difference(*q[:2]))
    
      # construct paths from <s> to <t> via elements of <x>
      for x in xs:
        p1s = list(paths(g, x, ss, 1, []))
        p2s = list(paths(g, x, ts, 1, [x]))
        for p1 in p1s:
          p1.reverse()
          for p2 in p2s:
            yield p1 + p2
    
    
    # find a subset of <graph>
    # that links all <targets> to <trunk>
    # maximum graph size is <m>
    def solve(graph, targets, trunk, m):
    
      # are we done?
      if not targets:
        yield trunk
        return
    
      # collect shortest paths from each target to the trunk
      # and record targets involved in each (non-terminal) vertex
      vs = defaultdict(set)
      for t in targets:
        for p in shortest(graph, [t], trunk, m):
          for v in p[:-1]:
            vs[v].add(t)
    
      # find the most shared vertices
      n = max(len(ts) for (v, ts) in vs.items())
    
      if n == 1:
        # there are no shared vertices
        # so add in any shortest path for each target
        final = set(trunk)
        for t in targets:
          for p in shortest(graph, [t], final, m):
            final.update(p)
            break
        yield final
        return
    
      # otherwise, add in one of the best shared vertices
      for (v, ts) in vs.items():
        if len(ts) == n:
          for p in shortest(graph, [v], trunk, m):
            trunk2 = trunk.union(p)
            if not(len(trunk2) > m):
              yield from solve(graph, diff(targets, p), trunk2, m)
          
    
    # default targets / trunk
    targets = 'one two three four five six seven eight nine ten'
    trunk = None
    
    import sys
    if len(sys.argv) > 1:
      targets = sys.argv[1]
      trunk = None
    if len(sys.argv) > 2:
      trunk = sys.argv[2]
    
    targets = targets.strip().lower().split()
    trunk = (trunk.strip().lower().split() if trunk else [targets[0]])
    printf("targets={targets}, trunk={trunk}")
    
    
    # make the subgraph to work on
    graph = make_graph(targets + trunk)
    N = len(graph.keys())
    printf("graph has {N} vertices")
    
    # construct subgraphs
    r = Accumulator(fn=min, value=N)
    for g in solve(graph, diff(targets, trunk), set(trunk), r.value):
      n = len(g)
      r.accumulate_data(n, g)
      if n == r.value: printf("[size={n}: {g}]", g=' '.join(sorted(g)))
    
    printf("found graph of size {r.value}")
    

    You can provide a list of target words as the first command line argument, and an optional list of trunk words as the second command line argument (if no trunk words are given the first target word is used). (Note that the lists are space separated, so will probably need to be quoted to stop the shell from splitting them).

    Depending on the wordlist and the trunk specified (if any), I get run times of about 8s to 15s for targets of ONE, …, TEN. (Most of the time goes in the construction of the subgraph, rather than searching it, so I should probably improve that part of the implementation).

    I’d be interested to hear from anyone who has a program that generates a smallest possible subgraph in a reasonable time.

    If SEN is not excluded from the list of candidate words we can get a solution graph with 24 vertices. (So 14 link words). Although the solution found also includes the words FON and FONE.

    • Jim Randell 29 July 2015 at 2:50 pm

      Running the program over a list of “normal” words provided by Hugh Casement with 61,752 candidate words in produces a connected graph with 419 vertices and several connected subgraphs with 25 vertices (15 link words) in 3.4s. (A lot smaller than the 267,627 candidate words and connected graph with 1176 vertices produced by the Scrabble Words).

      Most of the solutions are similar to the diagram given above, except that TING is not in the word list, so THINE is always included.

      Interestingly one of the solutions uses: FIVE – FIE – FIX – SIX, rather than linking SIX via SEX into the TEN / SEVEN area, and FIVE via FINE into NINE.

      Here are the branches of the tree from that solution:

      [EIGHT] – NIGHT – THING – THINE – TINE – TONE – TOE – TOO – [TWO]
      [FIVE] – FIE – FOE – (TOE)
      [FOUR] – FOR – (FOE)
      [NINE] – (TINE)
      [ONE] – (TONE)
      [SEVEN] – SEEN – SEE – TEE – (TOE)
      [SIX] – FIX – (FIE)
      [TEN] – (TEE)
      [THREE] – THEE – (TEE)

      Enigma 288a - Solution Hugh

Leave a Comment

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: