Social Detour - View Essay

I'm not one to do many puzzles by hand. The Rubik's cube was a puzzle that I couldn't let beat me, but aside from that, crossword puzzles, cryptoquotes, video games (with few exceptions), and Sudoku puzzles don't yield enough psychological reward for me to work so tediously on them. Sudoku did capture my interest as being an awesome database puzzle. Yes, they've been solved using SQL by others, but mine is the only solution I'm aware of that involves set matching and cross-row lookahead/lookbehinds in a single statement.

The other solutions that I've seen were either largely procedural or required lots of loops: http://www.vsj.co.uk/articles/display.asp?id=540 , http://www.sqlmag.com/Articles/ArticleID/47752/pg/2/2.html , http://technology.amis.nl/blog/?p=2066 -- this one does it in a single statement, but appears to utilize a proprietary declarative looping construct.

The only true other single statement solution that I've seen is on Joe Celko's site, solved by Richard Romley: http://www.celko.com/sudoku.txt (linked from http://www.celko.com/puzzles.htm -- Joe has a little introduction on it there. I think Richard's solution is absolutely brilliant.) On another side note, I also have three pages of fame in Joe's latest SQL Puzzles and Answers book, from pages 41 through 43 for my solution to his "Wages of Sin" puzzle.

Richard followed a vastly different approach from me; I haven't tested his solution, but I imagine that it would outperform mine given present-day RDBMS implementations. Whereas I followed a set-matching/elimination algorithm that utilized a set of computed combinations, Richard took the approach of an elimination-style algorithm; something I may have thought of in doing a 2x2 puzzle, but it never occurred to me for a 9x9 w/ 3x3 cells.

Breaking SQL Server and SQL Management Studio

SQL Server's query optimizer produced a 2600+ row execution plan for my query. SQL Management studio's Execution Plan visualizer navigation widget (the little map-like thing in the lower right-hand corner of the Execution Plan) is unable to handle it and causes GDI to crash.

Thankfully it doesn't crash Management Studio. Which, by the way, is my least favorite thing about SQL Server 2005. It's better than Enterprise Manager was in SQL Server 2000, but, aside from the text-editor features, is a huge step away from the niceties of Query Analyzer. The large memory footprint and lack of stability make it a poor tool for working on servers that are hard at work crunching data. Call me an elitist, but I believe that if you're going to be working with databases, you should do it through SQL. Using graphical tools for database design is like watching a movie instead of reading a book -- it forces you to use somebody else's interpretation of how things should be. Solving this type of problem using the GUI tools is absolutely impossible.

My machine is unable to run this as a single statement. At first, it would cause my CPU to overheat and my machine would just turn off. I had to break it down into multiple steps that perform the same action as the query's common-table-expressions (this step-based solution is included in the .zip file). Microsoft: this may be a useful query for testing and improving your optimizer. It's also useful for database benchmarking as it utilizes heavy CPU and IO operations.

What's in the .zip file

Disclaimer: All software is provided as-is. If you execute it and it causes your tempdb to grow to fill up your hard drive, overheats your CPU, et cetera, consider yourself warned. I make no warranties; run this stuff at your own risk.

SudokuRender -- a small .Net 3.5 WPF application that's capable of rendering your Sudoku result sets in a Sudoku-style grid. It improves readibility. You may need to update the connectionstring in the source code. I wrote this as a "programmer's tool" to help me confirm the validity of my results.
unpivoted.txt.pipe -- The pipe-delimitted file containing a handful of Sudoku problems.
sudoku_master.sql -- This script creates the Sudoku database and populates the set-matching tables. It takes several minutes to run on my old P4 hyperthreaded CPU w/ 1GB RAM and a 10kRPM Western Digital Raptor hard drive. You will need to update the path for the unpivoted.txt.pipe file in this .sql file for the bulk insert to work.
sudoku_solution.sql -- This is the query that my machine can't run, but it parses and I'm convinced that it would return the correct solution based on tests that I've performed; those tests are contained within sudoku_solution_step_by_step.sql
sudoku_solution_step_by_step.sql -- This contains the same queries as used in sudoku_solution.sql, only broken out into individual steps. SQL Server seems to do a much better job digesting this script than sudoku_solution.sql

How it all works

I'm still writing this...be patient

Below I catalog the different common table expressions that are used within the problem:

Expression Name	What it represents
Puzzle	The set of known values for the current Sudoku puzzle
RowSolutions	All valid state sets that match the known values in each row of the puzzle.
ColumnSolutions	All valid state sets that match the known values in each column of the puzzle.
CellSolutions	All valid state sets that match the known values in each 3x3 cell of the puzzle.
CompleteRowMatches	All valid state sets from the rows within the puzzle that are also compatible with the values in Column Solutions. In other words, RowSolutions may still contain a cell with a value that would conflict with the known value in another column. This expression eliminates those.
CompleteColumnMatches	All valid state sets from the columns within the puzzle, further limited by restrictions from the only valid allowable values from CompleteRowMatches.
MatchStateValues	The actual values associated with the sets computed in CompleteColumnMatches
CellMatchStateSets	Uses MatchStateValues to determine the statesets that work with the 3x3 cells
CellMatchStateSetValues	The remaining values, further limited
RowCellSets	The allowable combinations of 3, 3x3 cells in rows of three
ColCellSets	The allowable combinations of 3, 3x3 cells in columns of three

The final query of this is where everything comes together. The subquery that starts with "SELECT TOP 1 ..." would give us all possible solutions to the given Sudoku problem as a set of cell statesets. SELECT TOP 1 gives us one possible solution, which is all that we really care about. All of those self-joins determine set-compatibility. The outer query then gives us the distinct values in row-column-value sets.

This solution is quite unique in that it performs set aggregation across sets, utilizing a second degree of set logic. Instead of standard row-based relational operations, or comparison of rows within sets, this solution relies on comparisons of sets within sets. The first common-table expression returns sets of divisible sets, rather than performing a simple division operation based on known values within rows, it handles sets with missing values to determine compatibility.

SQL Source:

-- ## BEGIN DATABASE SCHEMA ## --

USE tempdb

GO

SET NOCOUNT ON

GO

IF DB_ID('Sudoku') IS NOT NULL BEGIN
        ALTER DATABASE Sudoku SET SINGLE_USER WITH ROLLBACK IMMEDIATE
        DROP DATABASE Sudoku
END

GO

CREATE DATABASE Sudoku

GO

USE Sudoku

GO

ALTER DATABASE Sudoku SET RECOVERY SIMPLE

GO

CREATE TABLE StateSets
(
        StateSet        INT
                        NOT NULL
,       Position        TINYINT
                        NOT NULL
                        CHECK (Position BETWEEN 1 AND 9)
,       Val             TINYINT
                        NOT NULL
                        CHECK (Val BETWEEN 1 AND 9)
,       PRIMARY KEY (StateSet, Position)
,       UNIQUE(StateSet, Val)
)

GO

CREATE INDEX IX_Position_Val ON StateSets (Position, Val, StateSet)

GO

CREATE TABLE Puzzles
(
        PuzzleGUID      UNIQUEIDENTIFIER
,       Row             TINYINT
                        NOT NULL
                        CHECK (Row BETWEEN 1 AND 9)
,       Col             TINYINT
                        NOT NULL
                        CHECK (Col BETWEEN 1 AND 9)
,       Cell AS (Row - 1) / 3 * 3 + (Col - 1) / 3 + 1 PERSISTED
,       CellPosition    AS ((Row - 1) % 3) * 3 + ((Col - 1) % 3 + 1) PERSISTED
,       Val             TINYINT
                        NOT NULL
                        CHECK (Val BETWEEN 1 AND 9)
,       PRIMARY KEY (PuzzleGUID, Row, Col, Cell)
,       UNIQUE (PuzzleGUID, Row, Val)
,       UNIQUE (PuzzleGUID, Col, Val)
,       UNIQUE (PuzzleGUID, Cell, Val)
,       UNIQUE (PuzzleGUID, Row, Col)
)

GO

CREATE TABLE #SudokuLoad
(
        PuzzleGUID UNIQUEIDENTIFIER
,       Row     INT
,       Col     INT
,       Val     INT
)
BULK INSERT #SudokuLoad
        FROM 'D:\Sudoku\unpivoted.txt.pipe' 
        WITH (ROWTERMINATOR = '\n', FIELDTERMINATOR = '|')
        
INSERT Puzzles (PuzzleGUID, Row, Col, Val)
SELECT * FROM #SudokuLoad
DROP TABLE #SudokuLoad

GO
-- nums = range(1,10)
DECLARE @States TABLE
(
        StateSet        INT
                        IDENTITY -- IDENTITY is fine here, table variable
                        PRIMARY KEY
,       P1              INT
,       P2              INT
,       P3              INT
,       P4              INT
,       P5              INT
,       P6              INT
,       P7              INT
,       P8              INT
,       P9              INT
)

DECLARE @V TABLE
(
        V       INT
)

DECLARE @I INT
SET @I = 1
WHILE @I <= 9 BEGIN
        INSERT @V
        SELECT @I
        SET @I = @I + 1
END

CREATE TABLE StateValues
(
        StateValue TINYINT PRIMARY KEY
)

INSERT StateValues SELECT V FROM @V

INSERT @States(P1,P2,P3,P4,P5,P6,P7,P8,P9)
SELECT  t1.V
,       t2.V
,       t3.V
,       t4.V
,       t5.V
,       t6.V
,       t7.V
,       t8.V
,       t9.V
        FROM @V t1,@V t2,@V t3,@V t4,@V t5,@V t6,@V t7,@V t8,@V t9
        --print '%s=%d' % (reduce (lambda a, b : '%s*%s'%(a, b), ['t%s.V'%n for n in nums]), reduce(lambda a, b:a*b, nums))
        WHERE t1.V*t2.V*t3.V*t4.V*t5.V*t6.V*t7.V*t8.V*t9.V=362880
          AND (t1.V+1)*(t2.V+1)*(t3.V+1)*(t4.V+1)*(t5.V+1)*(t6.V+1)*(t7.V+1)*(t8.V+1)*(t9.V+1)=3628800
        ORDER BY t1.V, t2.V, t3.V, t4.V, t5.V, t6.V, t7.V, t8.V, t9.V
          
SELECT * 
        INTO #States
        FROM @States

GO

INSERT StateSets
(
        StateSet
,       Position
,       Val
)
SELECT  StateSet
,       REPLACE(Position, 'P', '')
,       Val
        FROM #states
        UNPIVOT (Val FOR Position IN (P1, P2, P3, P4, P5, P6, P7, P8, P9))n
        
GO

-- ### BEGIN SOLUTION ### --

WITH Puzzle AS
(
        SELECT  Cell
        ,       CellPosition
        ,       Row     
        ,       Col
        ,       Val
                FROM Puzzles
                WHERE PuzzleGUID LIKE '4877%'
)
,       RowSolutions AS
(
        SELECT  RowStateSet
        ,       Row
        ,       Position
        ,       Val
                FROM (SELECT    StateSet RowStateSet 
                ,       Row
                        FROM Puzzle
                                inner join StateSets
                                        ON Position = Col
                                        AND Puzzle.Val = StateSets.Val
                                inner join StateValues
                                        ON Row = StateValue
                        GROUP BY StateSet, Row, Statevalue
                        HAVING COUNT(*) = (SELECT COUNT(*) FROM Puzzle WHERE Row = StateValue)
                ) a
                        INNER JOIN StateSets
                                ON RowStateSet = StateSet
)       ,
ColumnSolutions AS
(
        SELECT  ColStateSet
        ,       Col
        ,       Position
        ,       Val
                FROM (SELECT    StateSet ColStateSet
                ,       Col
                        FROM Puzzle
                                inner join StateSets
                                        ON Position = Row
                                        AND Puzzle.Val = StateSets.Val
                                inner join StateValues
                                        ON Col = StateValue
                        GROUP BY StateSet, Col, Statevalue
                        HAVING COUNT(*) = (SELECT COUNT(*) FROM Puzzle WHERE Col = StateValue)
                ) a
                        INNER JOIN StateSets
                                ON ColStateSet = StateSet
)       ,
CellSolutions AS
(
        SELECT  CellStateSet
        ,       Cell
        ,       Position
        ,       Val
                FROM (SELECT    StateSet CellStateSet
                ,       Cell
                        FROM Puzzle
                                inner join StateSets
                                        ON Position = CellPosition
                                        AND Puzzle.Val = StateSets.Val
                                inner join StateValues
                                        ON Cell = StateValue
                        GROUP BY StateSet, Cell, StateValue
                        HAVING COUNT(*) = (SELECT COUNT(*) FROM Puzzle WHERE Cell = StateValue)
                ) a
                        INNER JOIN StateSets
                                ON CellStateSet = StateSet
)       ,
CompleteRowMatches      AS
(
        SELECT  RowStateSet
        ,       Row 
                FROM RowSolutions
                        INNER JOIN ColumnSolutions
                                ON Row = ColumnSolutions.Position
                                AND Col = RowSolutions.Position
                                AND RowSolutions.Val = ColumnSolutions.Val
                GROUP BY RowStateSet, Row
                HAVING COUNT(DISTINCT Col) = 9
)       ,
CompleteColumnMatches   AS
(
        SELECT ColStateSet, Col
                FROM RowSolutions
                        INNER JOIN ColumnSolutions
                                ON Row = ColumnSolutions.Position
                                AND Col = RowSolutions.Position
                                AND RowSolutions.Val = ColumnSolutions.Val
                        INNER JOIN CompleteRowMatches
                                ON RowSolutions.Row = CompleteRowMatches.Row
                                AND RowSolutions.RowStateSet = CompleteRowMatches.RowStateSet
                GROUP BY ColStateSet, Col
                HAVING COUNT(DISTINCT RowSolutions.Row) = 9
)       ,
MatchStateValues        AS
(
        SELECT  ColStateSet StateSet
        ,       Col
        ,       Position Row
        ,       (Position - 1) / 3 * 3 + (Col - 1) / 3 + 1 Cell
        ,       ((Position - 1) % 3) * 3 + ((Col - 1) % 3 + 1) CellPos
        ,       Val
                FROM CompleteColumnMatches
                        INNER JOIN StateSets
                                ON ColStateSet = StateSet
)       ,
CellMatchStateSets      AS
(
        SELECT  CellSolutions.CellStateSet
        ,       CellSolutions.Cell
                FROM MatchStateValues 
                        INNER JOIN CellSolutions
                                ON MatchStateValues.Cell = CellSolutions.Cell
                                AND MatchStateValues.CellPos = CellSolutions.Position
                                AND MatchStateValues.Val = CellSolutions.Val
                GROUP BY CellSolutions.CellStateSet, CellSolutions.Cell
                HAVING COUNT(DISTINCT Position) = 9
)       ,
CellMatchStateSetValues AS
(
        SELECT  StateSets.*
        ,       (Position + 2) % 3 + ((cell + 2) % 3) * 3 + 1 Col
        ,       (Position - 1) / 3 + ((cell - 1) / 3) * 3 + 1 Row
        ,       Cell
                FROM StateSets
                        INNER JOIN CellMatchStateSets
                                ON StateSet = CellStateSet
)       ,
RowCellSets     AS
(
        SELECT a.CellStateSet   ASet
        ,       a.Cell          ACell
        ,       b.CellStateSet  BSet
        ,       b.Cell          BCell
        ,       c.CellStateSet  CSet
        ,       c.Cell          CCell
                FROM    CellMatchStateSets a
                ,       CellMatchStateSets b
                ,       CellMatchStateSets c
                WHERE   a.Cell % 3 = 1
                AND     b.Cell % 3 = 2
                AND     c.Cell % 3 = 0
                AND     b.Cell BETWEEN a.Cell AND c.Cell
                AND     c.Cell - a.Cell = 2
                AND     EXISTS
                (
                        SELECT  *
                                FROM
                                (
                                        SELECT  Row
                                                FROM CellMatchStateSetValues
                                                WHERE   (
                                                                StateSet = a.CellStateSet
                                                                AND Cell = a.Cell
                                                        )
                                                        OR
                                                        (
                                                                StateSet = b.CellStateSet
                                                                AND Cell = b.Cell
                                                        )
                                                        OR
                                                        (
                                                                StateSet = c.CellStateSet
                                                                AND Cell = c.Cell
                                                        )
                                                GROUP BY Row
                                                HAVING SUM(DISTINCT Val) = 45
                                ) y
                                HAVING COUNT(DISTINCT Row) = 3
                )
)       ,
ColCellSets     AS
(
        SELECT a.CellStateSet   ASet
        ,       a.Cell          ACell
        ,       b.CellStateSet  BSet
        ,       b.Cell          BCell
        ,       c.CellStateSet  CSet
        ,       c.Cell          CCell
                FROM    CellMatchStateSets a
                ,       CellMatchStateSets b
                ,       CellMatchStateSets c
                WHERE   a.Cell <= 3
                AND     c.Cell - b.Cell = 3
                AND     c.Cell - a.Cell = 6
                AND     EXISTS
                (
                        SELECT  *
                                FROM
                                (
                                        SELECT  Col
                                                FROM CellMatchStateSetValues
                                                WHERE   (
                                                                StateSet = a.CellStateSet
                                                                AND Cell = a.Cell
                                                        )
                                                        OR
                                                        (
                                                                StateSet = b.CellStateSet
                                                                AND Cell = b.Cell
                                                        )
                                                        OR
                                                        (
                                                                StateSet = c.CellStateSet
                                                                AND Cell = c.Cell
                                                        )
                                                GROUP BY Col
                                                HAVING SUM(DISTINCT Val) = 45
                                ) y
                                HAVING COUNT(DISTINCT Col) = 3
                )       
)       
SELECT DISTINCT Row
,       Col
,       Val 
        FROM CellMatchStateSetValues
        WHERE EXISTS (
        SELECT DISTINCT *
                FROM 
                (
                        SELECT  TOP 1
                                cA.ASet Cell1
                        ,       cA.BSet Cell4
                        ,       cA.CSet Cell7
                        ,       rA.BSet Cell2
                        ,       rA.CSet Cell3
                        ,       cB.BSet Cell5
                        ,       cB.CSet Cell8
                        ,       rB.CSet Cell6
                        ,       rC.CSet Cell9
                                FROM ColCellSets cA
                                        INNER JOIN RowCellSets rA
                                                ON cA.ASet = rA.ASet
                                                AND cA.ACell = 1
                                                AND rA.ACell = 1
                                        INNER JOIN ColCellSets cB
                                                ON rA.BSet = cB.ASet
                                                AND cB.ACell = 2
                                        INNER JOIN RowCellSets rB
                                                ON cA.BSet = rB.ASet
                                                AND cB.BSet = rB.BSet
                                                AND rB.ACell = 4
                                        INNER JOIN ColCellSets cC
                                                ON rA.CSet = cC.ASet
                                                AND rB.CSet = cC.BSet
                                                AND cC.ACell = 3
                                        INNER JOIN RowCellSets rC
                                                ON rC.ASet = cA.CSet
                                                AND rC.BSet = cB.CSet
                                                AND rC.CSet = cC.CSet
                                                AND rC.ACell = 7
                ) s UNPIVOT (StateSet FOR Cell IN (Cell1, Cell2, Cell3, Cell4, Cell5, Cell6, Cell7, Cell8, Cell9)) x
                WHERE CellMatchStateSetValues.StateSet = x.StateSet
                AND RTRIM(x.Cell) = RTRIM('Cell' + CAST(CellMatchStateSetValues.Cell AS VARCHAR(10)))
        )

GO

Cannot Comment

No comments have been posted

Cannot Comment

Our Staff | Terms of Service

Copyright© 2001 - 2025 - Alan Ashton - All Rights Reserved
Site designed and written by Alan Ashton. (View my profile [User: Alan])
Powered By Microsoft ASP.NET, SQL Server, Windows Server 2003, Apple OSX, Python, Ruby and me.
AIM: AntaresUSF
Email: alanashton@socialdetour.com