Visual Basic for Applications/The Knuth String Shuffle

Summary edit

  • This code module includes both a Fisher-Yates character shuffling routine for strings, and a Durstenfeld-Knuth routine to shuffle one-dimensional arrays.
    • The procedure FisherYatesStrShuffle() shuffles characters of a single string. This is limited to shifting single characters around within one string.
    • Procedure KnuthArrShuffle() sorts the elements of a one-dimensional array. The procedure is limited only by what can be stored in array elements.
  • The two methods are pseudo-random and bias-free. Elsewhere, the use of a random generator does not necessarily guarantee that the results will be free from bias.
  • The code can work in any of the MS Office applications that support VBA.

Code Notes edit

  • The Fisher-Yates shuffle applies a pseudo random selection method. It is described here for characters in a string but a related method, the Durstenfeld-Knuth method is preferred for arrays.
    • Taking each element of a string in sequence for repositioning leaves one end of the result string badly biased. The Knuth algorithm instead proposes a random position within the string. The element at that position is then accumulated into the output and removed from the original. Subsequent selections are made in the same way from the ever shortened string.
    • Note that there is still the possibility of a given character being unmoved in the process, but only within expectation.
    • Set the number of strings required with variable Cycles in the top procedure. The Immediate Window has proved the best place for display and copying.
    • It should be pointed out that any attempt to avoid the unmoved elements, will not only change the random nature of the shuffle but prevent the use of other than non-repeat strings. That is to say strings with repeated characters could not then be shuffled.
  • The Durstenfeld-Knuth method for arrays differs only slightly from that of the Fisher-Yates implementation.
    • To reduce processing, and no doubt to overcome the burden of removing an element from the middle of an array during shortening, the algorithm instead overwrites the element selected for output with the last element. In this VBA implementation the array is then conveniently shortened by one element with Redim Preserve.
  • See Fisher Yates Shuffle for a good description of both methods.

The VBA Code Module edit

Copy all of the code below into say, an MS Excel standard module, save the workbook as an xlsm file type, and run either of the test procedures to test the requisite code. Be sure to open the Immediate Window for output.

Option Explicit

Private Sub testFisherYatesStrShuffle()
    'run this to test the string shuffle
    
    Dim bOK As Boolean, sStr As String, sR As String
    Dim sOut As String, n As Long, Cycles As Long
    
    'set number of shuffled versions needed
    Cycles = 1
    
    'test string
    sStr = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

    For n = 1 To Cycles
        bOK = FisherYatesStrShuffle(sStr, sR)
        sOut = sR
        
        If bOK = False Then
           MsgBox "Problems in shuffle"
           Exit Sub
        End If
        
        'output to message box and immediate window
        'MsgBox "Before : " & sStr & vbCrLf & _
               "After    : " & sR
        Debug.Print "Before : " & sStr
        Debug.Print "After  : " & sOut & vbCrLf
    Next n
    
End Sub

Private Function FisherYatesStrShuffle(ByVal sIn As String, sOut As String) As Boolean
    'Performs a naive Fisher-Yates shuffle on the input string.
    'Returns result in sOut. Pseudo random character sequencing.
    
    'Note: Some or all elements could occupy their original positions,
    'but only in accordance with expectation based on the random generator.
    'This can be seen best for very short character strings, like "ABC".
            
    Dim sL As String, sR As String, sT1 As String, sT2 As String, sMod As String
    Dim sAcc As String, i As Long, j As Long, nL As Long, n As Long
        
    'check input string
    If sIn = "" Or Len(sIn) < 2 Then
       MsgBox "At least 2 characters needed - closing"
       Exit Function
    End If
        
    'initial assignments
    nL = Len(sIn)
    sMod = sIn
    
    Randomize
    For i = 1 To Len(sIn)
        'first get a random number
        j = Int((nL - 1 + 1) * Rnd + 1)
            
        'find string value of jth element
        sT1 = Mid$(sMod, j, 1)
        DoEvents 'allow break
                
        'accumulate jth element
        sAcc = sAcc & sT1
        
        'remove current character
        sL = Left$(sMod, j - 1)
        sR = Right$(sMod, nL - j)
        sMod = sL & sR
        
        'new string length
        nL = Len(sMod)
        DoEvents 'allow break
    Next i

    'transfer
    sOut = sAcc
    
    FisherYatesStrShuffle = True

End Function

Private Sub testKnuthArrShuffle()
    'run this to test the array shuffle
    
    Dim bOK As Boolean, sStr As String, sOut As String
    Dim Cycles As Long, n As Long, bF As Boolean
    Dim vS As Variant, vA As Variant, vB As Variant
           
    'define a typical array for shuffling
    vS = Array("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", _
               "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z", _
               "0", "1", "2", "3", "4", "5", "6", "7", "8", "9")
    
    'set number of shuffled versions needed
    Cycles = 1
    
    For n = 1 To Cycles
    
        'shuffle array
        bOK = KnuthArrShuffle(vS, vA)
                
        If bOK = False Then
           MsgBox "Problems in array shuffle"
           Exit Sub
        End If
            
        'arrays to strings for display
        sStr = Arr1DToStr2(vS)
        sOut = Arr1DToStr2(vA)
        
        'display
    '    MsgBox "Before : " & sStr & vbCrLf & _
    '           "After    : " & sOut
        Debug.Print "Before : " & sStr
        Debug.Print "After  : " & sOut & vbCrLf
           
        'return to an array in vB if needed
        bF = StrTo1DArr2(sOut, vB)

    Next n

End Sub

Private Function KnuthArrShuffle(vIn As Variant, vR As Variant) As Boolean
    ' Performs a modified Knuth random shuffle on the elements of the input array.
    ' The original by Fisher-Yates, was modified for computers by Durstenfeld
    ' then popularised by Knuth. Returns result in vR with vIn unchanged.
       
    'Note: Some or all elements COULD occupy their original positions,
    'but only in accordance with expectation based on the random generator.
    'This is best seen for small arrays, say with only 3 elements or so.
        
    Dim vW As Variant
    Dim LB As Long, UB As Long, nL As Long
    Dim i As Long, j As Long
    
    'initial assignments
    LB = LBound(vIn): UB = UBound(vIn)
    ReDim vR(LB To UB) 'return array
    ReDim vW(LB To UB) 'work array
    nL = UB - LB + 1   'length of input array
    vW = vIn 'transfer to a work array
            
    'working
    Randomize
    For i = LB To nL
        'first get a random number
        j = Int((UB - LB + 1) * Rnd + LB)
            
        'transfer jth of vW to ith of vR
        vR(i) = vW(j)
        
        'replace selection with current last of vW
        vW(j) = vW(UB)
        
        'remove last of vW by shortening array
        ReDim Preserve vW(LB To UB - 1)
        
        'get new UBound of shortened vW
        UB = UBound(vW)
        
        'exception; return if last chara
        If UB = LB Then
            vR(i + 1) = vW(UB)
            Exit For
        End If
                
        DoEvents 'allow breaks
    Next i
        
    KnuthArrShuffle = True

End Function
Function StrTo1DArr2(ByVal sIn As String, vRet As Variant, _
                    Optional ByVal bLB1 As Boolean = True) As Boolean
    ' Loads string characters into 1D array (vRet). One per element.
    ' Optional choice of lower bound. bLB1 = True for one-based (default),
    ' else bLB1 = False for zero-based. vRet dimensioned in proc.

    Dim nC As Long, sT As String
    Dim LB As Long, UB As Long
    
    If sIn = "" Then
        MsgBox "Empty string - closing"
        Exit Function
    End If
    
    'allocate array for chosen lower bound
    If bLB1 = True Then
        ReDim vRet(1 To Len(sIn))
    Else
        ReDim vRet(0 To Len(sIn) - 1)
    End If
    LB = LBound(vRet): UB = UBound(vRet)

    'load charas of string into array
    For nC = LB To UB
        If bLB1 = True Then
            sT = Mid$(sIn, nC, 1)
        Else
            sT = Mid$(sIn, nC + 1, 1)
        End If
        vRet(nC) = sT
    Next

    StrTo1DArr2 = True

End Function
    
Function Arr1DToStr2(vIn As Variant) As String
    ' Makes a single string from 1D array string elements.
    ' Works for any array bounds.
        
    Dim nC As Long, sT As String, sAccum As String
    Dim LB As Long, UB As Long
    
    LB = LBound(vIn): UB = UBound(vIn)

    'join characters of array into string
    For nC = LB To UB
        sT = vIn(nC)
        sAccum = sAccum & sT
    Next

    Arr1DToStr2 = sAccum

End Function

See Also edit

  • Fisher Yates Shuffle: A very clearly written article in Wikipedia, that explains worked examoles step by step.

External Links edit