Soon after I saw Nate's Word Galaxy Generator, I started thinking really nerdy thoughts.

For every set of words, there must be a set of all possible word galaxies, but some galaxies are "better" than others. In fact, there must be one or more "optimal" word galaxies.

So what's an "optimal" word galaxy? That'd be the word galaxy composed of a given set of words in which the most possible letters are shared, crossword puzzle style, or another way of stating it, the word galaxy which can be drawn in the fewest possible grid squares.

It's easy to see that at least one optimal word galaxy must exist for every set of words, but it's not so easy to see how to find it, or how to prove that a given galaxy is in fact the optimal galaxy.

Well, there's the Infinite Monkeys Algorithm of trying each word in every possible position and orientation, until you've examined all possible word galaxies and found the best one. There might be a more reasonable algorithm, but if there is, I don't know what it is.

Nate's Word galaxy generator, (the one I found by searching Google anyway) appears to my untrained Perl-challenged eye to use the Hundred Monkeys Algorithm, it just tries 100 random places for each word, and does not pretend to find the optimal word galaxy.


Here, I've chosen "optimal" to mean "most compact" and I've chosen to disregard any meaning which might be assigned to words that make up the galaxies. If you allow the meanings of words and their arrangements to influence your decision of which galaxy is "better" than another, then it is nonsense to talk of finding an "optimal" galaxy, of course. You can imagine that there might exist spectacular esamples of galaxies which densely pack the grid and produce an entire story with all words intersecting like crazy.
Update 11:05pm CST, Nov 16, 2000

I wrote a C program, surprise surprise, to find optimal word galaxies. But, for anything but the smallest galaxies, you'd better have some time on your hands. Some optimization is done, but there are no doubt more tricks you can pull to speed it up. I leave this as an exercise for the reader. (Heh, I've always wanted to do that)

My algorithm is, I think, the same order as an algorithm which tries all combinations, which is to say, O(2n) where n is the number of words. As flyingroc notes, by the time it finishes you might well be dead. However, it does manage to exclude testing a quite a few combinations in what I like to pretend is a clever way. Update: Nov 18, 2000 I've thought of another better algorithm, but I'm not all that interested in implementing it right now, (maybe later). It would go something like this: For a given list of words, find all the possible intersections, then start trying all the combinations of intersections all at once and test to see if each combination is a valid galaxy. If none are, take one intersection out, try all combinations of the remaining intersections. If none work, take out a different intersection...etc. Then try taking out all combos of 2 intersections, then 3, etc. The first valid galaxy you find should be an optimal one. I'm not sure what the performance characteristics of this algorithm would be, but I suspect it would be better than O(2n).

But anyway, here is my O(2n) C implementation:


#include <stdio.h>
#include <malloc.h>


#define RIGHT 0
#define DOWN 1

struct grid_t {
        char *grid;
        int x, y;
};

struct word_placement_t {
        int x, y, orientation;
        char *word;
};

struct word_galaxy_t {
        struct word_placement_t *wp;
        int nwords;
        int score;
};


struct grid_t *
make_grid ( char *wordlist[] )
{
        struct grid_t *rval;

        int maxlen=0, len;
        char **w;

        for (w=wordlist; *w; w++)
        {

                len = strlen(*w);
                if (len > maxlen) maxlen = len;
        }


        rval = malloc (sizeof(struct grid_t));
        if (rval == NULL) return rval;

        rval->x = maxlen*2;
        rval->y = maxlen*2;
        rval->grid = (char *) malloc ( rval->x * rval->y);

        return rval;
}

int
count_words(char **wordlist)
{
        int count=0;
        char **w;

        for (w=wordlist; *w != NULL;  w++) count++;
        return count;
}

void
init_grid(struct grid_t *grid)
{
        int i, limit;

        limit = grid->x * grid->y;

        for (i=0;i<limit;i++)
                grid->grid[i] = '.';

}

struct word_galaxy_t *
init_galaxy(char *wordlist[], int init_score)
{
        struct word_galaxy_t *wg;
        int nwords, i;

        nwords = count_words(wordlist);
        printf("nwords=$d\n", nwords);

        wg = (struct word_galaxy_t *)
                malloc(sizeof(struct word_galaxy_t));

        if (wg == NULL) return wg;


        wg->wp = (struct word_placement_t *)
                malloc (sizeof (struct word_placement_t) * nwords);

        for (i=0;i<nwords;i++)
        {
                printf("%s\n", wordlist[i]);
                wg->wp[i].word = wordlist[i];
                wg->wp[i].x = 0;
                wg->wp[i].y = 0;
                wg->wp[i].orientation = RIGHT;
        }

        wg->nwords = nwords;
        wg->score = init_score;
        return(wg);
}


int
score_grid(struct grid_t *grid)
{
        int score=0;
        char *c;
        int i, limit;

        c = grid->grid;
        limit = grid->x * grid->y;

        for (i=0;i<limit;i++)
        {
                if (*c != 0) score++;
                c++;
        }

        return(score);
}

void
print_grid(struct grid_t *grid)
{
        int x,y;
        char *c;

        c = grid->grid;

        for (y = 0; y< grid->y; y++)
        {
                for (x=0;x<grid->x;x++)
                {
                        printf("%c", *c == 0 ? '.' : *c);
                        c++;
                }
                printf("\n");
        }
        printf("------------------------- %d\n",
                score_grid(grid));
}

int
place_word(struct word_placement_t *wp,
        struct grid_t *grid)
{
        int dx=0, dy=0;
        int x, y;
        char *c, *d;

        if (wp->orientation == RIGHT)
                dx=1;
        else
                dy=grid->x;

        c = wp->word;
        x = wp->x;
        y = wp->y;
        d = grid->grid + (y*grid->x)+x;

        if (dx >0 && x + strlen(c) > grid->x) return -1;
        if (dy >0 && y + strlen(c) > grid->y) return -1;
        while (*c)
        {
                if ((*d != 0) && (*d != *c))
                        return(-1);

                *d = *c;
                d += (dx + dy);
                c++;
        }
        return(0);
}

int
is_invalid_galaxy(struct word_galaxy_t *wg,
                struct grid_t *grid)
{
        int i;

        /* init_grid(grid); */
        bzero(grid->grid, grid->x * grid->y);

        for (i=wg->nwords-1;i>=0;i--)
        {
                if (place_word( &wg->wp[i], grid ) != 0)
                        return i; /* invalid on this word */
        }
        return -1;  /* not invalid */

}

void
zero_lesser(struct word_galaxy_t *wg, int word)
{
        int i;

        for (i=0;i<word-1;i++)
        {
                wg->wp[i].x = 0;
                wg->wp[i].y = 0;
                wg->wp[i].orientation = RIGHT;
        }
}

int
advance_galaxy(struct word_galaxy_t *wg, int n, int xmax, int ymax)
{
        if (wg->wp[n].orientation == RIGHT)
        {
                wg->wp[n].orientation = DOWN;
                return 0;
        }

        wg->wp[n].orientation = RIGHT;

        wg->wp[n].x++;
        if (wg->wp[n].x < xmax)
                return 0;

        wg->wp[n].x = 0;
        wg->wp[n].y++;
        if (wg->wp[n].y < ymax)
                return 0;


        wg->wp[n].y = 0;
        if (n >= wg->nwords-1)
                return 1;
        advance_galaxy(wg, n+1, xmax, ymax);
        return 0;

}

find_best_galaxy( struct word_galaxy_t *best_galaxy,
                struct word_galaxy_t *current_galaxy,
                struct grid_t *grid,
                char **wordlist)
{
        int i;
        int new_score;
        int done;
        int theword;
        int count=0;

        do
        {
                theword = is_invalid_galaxy(current_galaxy, grid);
                if (theword == -1)
                {
                        if (best_galaxy->score == 0)
                        {
                                printf("barf!\n");
                                exit(1);
                        }
                        new_score = score_grid(grid);
                        if (new_score <= best_galaxy->score)
                        {
                                *best_galaxy = *current_galaxy;
                                best_galaxy->score = new_score;
                                print_grid(grid);
                        }
                }
                done = advance_galaxy(current_galaxy,
                        (theword == -1) ? 0 : theword,
                                grid->x, grid->y);
                if (theword > 0)
                        zero_lesser(current_galaxy, theword-1);
                count++;
        } while (!done);
}

int
main(int argc, char *argv[])
{

        struct word_galaxy_t *best_galaxy, *current_galaxy;
        struct grid_t *grid;
        char **wordlist;
        int i;


        wordlist = (char **) malloc(sizeof(char *) * (argc));
        for (i=1;i<argc;i++)
        {
                wordlist[i-1] = argv[i];
        }
        wordlist[argc-1] = NULL;

        printf("owg\n");
        grid = make_grid(wordlist);
        init_grid(grid);
        printf("a\n");

        best_galaxy = init_galaxy(wordlist, grid->x * grid->y);
        printf("b\n");
        current_galaxy = init_galaxy(wordlist, grid->x * grid->y);
        printf("c\n");


        find_best_galaxy(best_galaxy, current_galaxy, grid, wordlist);

}



Litmus, my program finds your galaxy rather quickly, a cut-n-paste from its output:

...
------------------------- 12
.boat...
.o..u...
.o..b...
.tome...
........
........
........
........
But, you do bring up an excellent point, you might define "optimal word galaxy" as finding the optimal words that fit a given shape. In other words, using an entire dictionary instead of a list of words, and given a shape that words must fit into, find the words. A completely different problem.

How to determine if optimal word galaxy has a reasonable algorithm

First, determine whether optimal word galaxy is in NP. Ie, given a word galaxy, can you determine if it is optimal in polynomial time? If your answer is "no" then you're dead. About the only way you can find the optimal word galaxy is to list out all possible combinations.

Only if you can find an algorithm that can verify the solution in polynomial time should you even try to proceed to find a polynomial time algorithm to solve the problem. Or prove that the problem is np-complete.

Some intuition tells me, though, that the problem is not in NP. I leave it to those smarter than me to prove (or disprove) my claim.

In my mind, the optimal word galaxy would be the one that I have not been able to find yet, ie:

BOOT

O__U

A__B

TOME

The perfect square word galaxy!

Even clicking explore with the words boot, tome, boat, and tube repetedly doesn't make the optimal word galaxy.

Sigh, I'll just keep on trying...

Log in or register to write something here or to contact authors.