There are many times I wish I could strangle (figuratively) one or more developers at Microsoft. The choices they make, especially when it comes to developing utilities/macros for their Office suite (though there is no shortage of issues when it comes to just using Office), are mind-boggling. I have to imagine that, over the course of 39 years in development, the sheer quantity of code for the product (Word) has become absurd, but some decisions are...less than ideal.

Say I want to alter text with a macro. I have

The quick brown fox jumps over the lazy dog.

where the underlined text ("quick brown") denotes a field. Let's call that field "fox adjectives", so if you press Alt+F9 in your document you would see

The { DOCPROPERTY  "fox adjectives"  \* MERGEFORMAT } fox jumps over the lazy dog.

So far, so standard. But let's say you write a macro to tell you how long the text is when you highlight a portion:

Function CountChars()
    CountChars = Selection.Range.End - Selection.Range.Start
End Function

Say you highlight "The quick brown fox". You would expect your character-counting macro to say, perhaps, 19. Or even 57. The correct answer? 69. The field code, the field value, and some invisible extra control characters are all stored and counted when using the Range object.
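The arithmetic behind 19, 57, and 69 can be reproduced outside Word. The sketch below is only a model, not Word's API: it assumes the field is stored as a begin marker, the code, a separator, the result, and an end marker (control characters 19, 20, and 21, as far as I can tell), and the names FieldModel, Raw, Hidden, and Shown are made up for illustration.

```csharp
using System;

static class FieldModel
{
    // Assumed markers stored around a field: begin, separator, end.
    public const char Begin = '\u0013', Separator = '\u0014', End = '\u0015';

    // The field code and field result from the example sentence.
    public const string Code = " DOCPROPERTY  \"fox adjectives\"  \\* MERGEFORMAT ";
    public const string Result = "quick brown";

    // Model of what Range.Start/End count: markers, code, and result all occupy positions.
    public static string Raw(string before, string after) =>
        before + Begin + Code + Separator + Result + End + after;

    // Model of Range.Text with field codes hidden: only the result is visible.
    public static string Hidden(string before, string after) =>
        before + Result + after;

    // Model of Range.Text with field codes shown (assumed: code plus begin/end markers).
    public static string Shown(string before, string after) =>
        before + Begin + Code + End + after;
}
```

With "The " before the field and " fox" after it, FieldModel.Raw(...).Length is 69, FieldModel.Hidden(...).Length is 19, and FieldModel.Shown(...).Length is 57. The 50-character gap between 19 and 69 is the 47-character field code plus the three markers.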

Now, I can get the length of the selected text by instead doing

Len(Selection.Range.Text)

That provides 19 (or 57 if the field codes are displayed), but there's a hitch. How often do you want to count characters in a macro? What if, instead, you want to change the word "lazy" to "hyperactive"?

One way is to do the following:

Selection.Find.Execute FindText := "lazy", Forward := True
Selection.Range.Text = "hyperactive"

That's all fine and dandy (unless you're using C# and have to specify all 15 arguments of the Execute method), but again, we're looking at a fairly simplistic case. Say I want to find something based on a regular expression. This isn't a built-in feature of Word, but throw in some code and you can do it. The problem comes when you try to highlight it if it comes after a field.

We have to leave VBA behind, because the computer I'm working with for some reason doesn't include Microsoft VBScript Regular Expressions 5.5 (only 1.0)—but that doesn't bother me much.

Here's the relevant code in C#:

foreach (Paragraph paragraph in paragraphs)
{
    Match match = Regex.Match(paragraph.Range.Text, @"\blazy\b");
    if (match.Success)
    {
        Range range = paragraph.Range;
        range.Start = paragraph.Range.Start + match.Index;
        range.End = paragraph.Range.Start + match.Index + match.Length;
        range.Select();
    }
}

The \b word boundaries are there so that something like "glazy" doesn't match. Someone using that word might be unlikely, but this is an example, so we'll roll with it.

Word says, "Okay, select a range from character 35 to character 39. I can do that!"

I beg to differ. Those indices fall inside the field, which occupies characters 4–65, so we have to adjust them. But how? We need to account for every field that comes before our match, and there may be more than one. Say "fox" was a field as well: then even though the unadjusted indices land inside the first field, we have to deal with two fields.
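To see why character 35 lands inside the field, it helps to model the stored paragraph as a plain string. This is a sketch rather than anything Word exposes: OffsetDemo and its members are hypothetical names, and the marker characters (19, 20, 21) are my assumption about how the field is delimited.

```csharp
using System;
using System.Text.RegularExpressions;

static class OffsetDemo
{
    const string Code = " DOCPROPERTY  \"fox adjectives\"  \\* MERGEFORMAT ";
    const string Result = "quick brown";

    // Paragraph text as Range.Text reports it with field codes hidden.
    public const string Visible = "The quick brown fox jumps over the lazy dog.";

    // Modeled stored paragraph: every marker, code, and result character has a position.
    public static readonly string Stored =
        "The " + '\u0013' + Code + '\u0014' + Result + '\u0015'
        + " fox jumps over the lazy dog.";

    // Index of the match in the visible text (what Regex reports).
    public static int VisibleIndex() => Regex.Match(Visible, @"\blazy\b").Index;

    // Index of the same word among the stored positions (what Range.Start needs).
    public static int StoredIndex() => Regex.Match(Stored, @"\blazy\b").Index;

    // The four stored characters that the unadjusted indices would select.
    public static string SelectedByVisibleIndex() => Stored.Substring(VisibleIndex(), 4);
}
```

In this model VisibleIndex() is 35 and StoredIndex() is 85, and SelectedByVisibleIndex() returns two spaces followed by \* from the middle of the field code instead of "lazy". The field itself runs from position 4 up to 65, which is why an unadjusted range of 35 to 39 ends up inside it.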

Here are some options:

Option 1

Edit the code to the following:

foreach (Paragraph paragraph in paragraphs)
{
    Match match = Regex.Match(paragraph.Range.Text, @"\blazy\b");
    if (match.Success)
    {
        Range range = paragraph.Range;
        range.Start = paragraph.Range.Start + match.Index;
        range.End = paragraph.Range.Start + match.Index + match.Length;
        while (range.Text != match.Groups[0].Value && range.End < paragraph.Range.End) 
        {
            range.Start += 1;
            range.End += 1;
        }
        range.Select();
    }
}

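This works for simple regular expressions, but lookaheads, lookbehinds, and zero-width matches can cause issues: the loop only compares the matched text, so it stops at the first stretch of text equal to the match, which isn't necessarily the one the regex picked. It might be okay, but it might not.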

Option 2

Edit the code to the following:

foreach (Paragraph paragraph in paragraphs)
{
    Match match = Regex.Match(paragraph.Range.Text, @"\blazy\b");
    if (match.Success)
    {
        Range range = paragraph.Range;
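        // "missing" is assumed to be an object holding Type.Missing; the two explicit arguments are Forward = true and Wrap = wdFindStop.
        range.Find.Execute(match.Groups[0].Value, missing, missing, missing, missing, missing, true, WdFindWrap.wdFindStop, missing, missing, missing, missing, missing, missing, missing);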
        range.Select();
    }
}

This also works for simple regular expressions, but Find only sees the matched text, not the context a lookahead or lookbehind required, so it can land on an earlier occurrence of the same text. Zero-width matches are a problem too. It might be okay, but it might not.
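A small case shows how a lookbehind can trip up a search for the matched text alone. This sketch uses only .NET's Regex and string.IndexOf as a stand-in for a plain-text find; LookaroundDemo is a made-up name and the sample sentence is mine.

```csharp
using System;
using System.Text.RegularExpressions;

static class LookaroundDemo
{
    public const string Text = "lazy and very lazy";

    // The lookbehind only accepts a "lazy" that is preceded by "very ".
    public static Match RegexMatch() => Regex.Match(Text, @"(?<=very )lazy");

    // Searching for the matched value alone ignores the context the regex required.
    public static int PlainFindIndex() =>
        Text.IndexOf(RegexMatch().Value, StringComparison.Ordinal);
}
```

Here RegexMatch().Index is 14, the second "lazy", while PlainFindIndex() is 0, the first one. Any approach that looks for the matched string by itself would pick the wrong word in a paragraph like this.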

Option 3

Edit the code to the following:

foreach (Paragraph paragraph in paragraphs)
{
    Match match = Regex.Match(paragraph.Range.Text, @"\blazy\b");
    if (match.Success)
    {
        Range range = paragraph.Range;
        range.Start = paragraph.Range.Start + match.Index;
        range.End = paragraph.Range.Start + match.Index + match.Length;
        int i = 1;
        while (range.Text != match.Groups[0].Value
               && i <= paragraph.Range.Fields.Count)
        {
            Field field = paragraph.Range.Fields[i];
            int adjustment = field.ShowCodes ? field.Result.Text.Length + 1 : field.Code.Text.Length + 3;
            range.End += adjustment;
            range.Start += adjustment;
            i++;
        }
        range.Select();
    }
}

This works for more complex regular expressions, including those with lookaheads, lookbehinds, and zero-width characters, but a field could conceivably exist where this still matches the wrong segment of text. We need to make sure that we're not matching anything inside a field.

foreach (Paragraph paragraph in paragraphs)
{
    Match match = Regex.Match(paragraph.Range.Text, @"\blazy\b");
    if (match.Success)
    {
        Range range = paragraph.Range;
        range.Start = paragraph.Range.Start + match.Index;
        range.End = paragraph.Range.Start + match.Index + match.Length;
        int i = 1;
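        // Check the field count first so Fields[i] is never read past the last field;
        // the parentheses keep the || tests from bypassing that guard.
        while (i <= paragraph.Range.Fields.Count
               && (range.Text != match.Groups[0].Value
                   || (bool)range.Information[WdInformation.wdInFieldCode]
                   || (bool)range.Information[WdInformation.wdInFieldResult]))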
        {
            Field field = paragraph.Range.Fields[i];
            int adjustment = field.ShowCodes ? field.Result.Text.Length + 1 : field.Code.Text.Length + 3;
            range.End += adjustment;
            range.Start += adjustment;
            i++;
        }
        range.Select();
    }
}

Huzzah! Now change range.Select() to whatever you want to do and you're golden!

I recommend making a function with the signature

Range GetRange(Range range, Regex regex)