Saturday, July 26, 2014

Slugify a tile in excel


So, this in not properly a C# article.
I needed to write a function in Excel that create a slug (or a premalink) out of a title.
Permalink is a valid url created out of a piece of text (usually a blog post title).
The operation are quite simple: all lowercase, replace spaces and non alphabet characters with -, convert accents to plain (eg à becames a).
So in excel just VBA works as standard, so I take the slugify funcion in Orchard CMS open source that is C# and I converted in VBA.
Pretty easy? No.
Anyway I did it...

The original code follows:

        private string Slugify(FillSlugContext slugContext) {
            _slugEventHandler.FillingSlugFromTitle(slugContext);

            if (!slugContext.Adjusted) {

                var disallowed = new Regex(@"[/:?#\[\]@!$&'()*+,.;=\s\""\<\>\\\|%]+");

                var cleanedSlug = disallowed.Replace(slugContext.Title, "-").Trim('-','.');

                slugContext.Slug = Regex.Replace(cleanedSlug, @"\-{2,}", "-");

                if (slugContext.Slug.Length > 1000)
                    slugContext.Slug = slugContext.Slug.Substring(0, 1000).Trim('-', '.');

                slugContext.Slug = StringExtensions.RemoveDiacritics(slugContext.Slug.ToLower());
            }
            
            _slugEventHandler.FilledSlugFromTitle(slugContext);

            return slugContext.Slug;
        }

And this is the VBA result...


Function CreateSlug(text As String) As String
    
    Dim cleanedSlug As String
    cleanedSlug = text
    
    'Regex that replaces any invalid char with an -
    Set regEx = New RegExp
    regEx.Pattern = "[:?#\[\]@!$&'()*+,.;=\s\""\<\>\\\|%]+"
    regEx.IgnoreCase = True

    'The loop is needed because in vba it matches just first char in string
    Dim tm As String
    Do
        tm = cleanedSlug
        cleanedSlug = regEx.Replace(cleanedSlug, "-")
    Loop While (tm <> cleanedSlug)
    
    'removes all trailing - left    
    cleanedSlug = Trim(cleanedSlug, "-")

    'if there are 2 - replace with 1    
    regEx.Pattern = "\-{2,}"
    cleanedSlug = regEx.Replace(cleanedSlug, "-")

    'all lowercase
    cleanedSlug = LCase(cleanedSlug)

    'remove accents
    cleanedSlug = ReplaceAccents(cleanedSlug)

    'but why Vba designers removed the return keyword?
    CreateSlug = cleanedSlug
    
    Set regEx = Nothing
End Function
This helper function mimics the trim specific characters on the end of a string that does not exists in vba
Function Trim(text As String, trailing As String)

    If (Right(text, Len(trailing)) = trailing) Then
        Trim = Left(text, Len(text) - Len(trailing))
    Else
        Trim = text
    End If

End Function
This helper function remove specific accents in a string replacing with normal cheracters
Function ReplaceAccents(text As String)
    Const AccChars As String = "àáâãäåçèéêëìíîïðñòóôõöùúûüýÿø"
    Const RegChars As String = "aaaaaaceeeeiiiidnooooouuuuyyo"

    For J = 1 To Len(AccChars)
        text = Replace(text, Mid(AccChars, J, 1), Mid(RegChars, J, 1), compare:=vbBinaryCompare)
    Next J

    ReplaceAccents = text
End Function

Submit this story to DotNetKicks

No comments:

Post a Comment