Weatherlawyer

Non-printing characters.

Turn on regular expressions by selecting the 'Regular expressions'
option in the Edit > 'Find & Replace' dialogue; this allows you to
search using 'wildcards'.

Wildcards are similar to the wildcards you might use in specifying a
range of file names (for example, 'C:\My Documents\text*.*' or
'/home/me/text*').

A match cannot span a paragraph boundary, although the same text can
be found in any number of separate paragraphs.

To search for tabs and paragraph endings, open the Find dialogue and
select the 'Regular expressions' option. To find a tab, enter:

\t

To find more than one tab character, just enter '\t' more than once:

\t\t\t

...searches for 3 tabs in a row. And you can, of course, search for
terms such as this one:

Date\tTime\tName

....which searches for any occurrence of the word 'Date', followed by
a tab, followed by the word 'Time', followed by another tab, followed
by the word 'Name'.
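
If you'd like to try the tab searches outside the word processor, here
is a minimal sketch in Python. It assumes Python's 're' module treats
'\t' the same way the Find dialogue does; the sample text is made up.

```python
import re

# Hypothetical sample text: a header row with tab-separated columns.
text = "Date\tTime\tName\n01/01/08\t23:36\tWeatherlawyer"

# A raw string keeps the backslash, so the regex engine itself sees \t.
header = re.search(r"Date\tTime\tName", text)

# Three tabs in a row, as in the '\t\t\t' search term.
three_tabs = re.search(r"\t\t\t", "a\t\t\tb")

print(header is not None)   # was the header row found?
print(three_tabs.span())    # where the run of three tabs sits
```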

The 'end-of-line' character:

$

If you want to find a paragraph that ends with particular text, enter
the word or words you want to find followed by the dollar sign, e.g.:

rhubarb rhubarb full stop or whatever$

....in the Search box.
You can find the next paragraph that ends with this text simply by
clicking 'Find' (or pressing the Enter key) again.

You can only use regular expressions containing the end-of-line
character ('$') within a single paragraph. You can't expect a term
that crosses the paragraph end, such as:

rhubarb rhubarb.$Rhubarb blabber..

....to work; you'll just get the message “Search key not found”.
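
The one-paragraph-at-a-time rule can be sketched in Python, on the
assumption that each paragraph is a separate string in a list. The
function name and the sample paragraphs are invented for illustration,
and Python's engine is not the one OpenOffice uses.

```python
import re

# Assumption: each list item stands for one paragraph of the document.
paragraphs = [
    "Some talk of rhubarb rhubarb.",
    "Rhubarb blabber follows, then more rhubarb rhubarb.",
    "No match here, rhubarb rhubarb. And then some.",
]

def paragraphs_ending_with(words, paras):
    """Return the indexes of paragraphs whose text ends with `words`."""
    pattern = re.compile(re.escape(words) + r"$")
    return [i for i, p in enumerate(paras) if pattern.search(p)]

ending = paragraphs_ending_with("rhubarb rhubarb.", paragraphs)

# Text placed after '$' can never match inside a single paragraph.
crossing = [p for p in paragraphs if re.search(r"rhubarb\.$Rhubarb", p)]

print(ending)
print(crossing)
```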

A 'row break', also sometimes known as a 'soft line break', is the one
you generate when you press Shift+Enter. It puts the text on the next
line without starting a new paragraph. To search for a 'row break'
enter:

\n

...for 'newline' in the Search box. And, of course, searching for:

now\nit

....would find a row break between 'now' and 'it', provided you
remember to select the 'Regular expressions' option.
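
A rough Python equivalent, assuming the row break is stored as a
newline character inside the paragraph's text (an assumption about the
model, not a description of how OpenOffice stores it):

```python
import re

# Assumption: Shift+Enter shows up as "\n" inside one paragraph string.
paragraph = "It was then, and it is now\nit seems, that we meet."

match = re.search(r"now\nit", paragraph)

# Replacing each row break with a space rejoins the rows.
joined = re.sub(r"\n", " ", paragraph)

print(match.group(0) == "now\nit")
print(joined)
```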

Use the

^

....character to search for the start of a paragraph. It won't work if
you enter it in the Search box on its own. You need to follow it with
something, and the simplest thing to follow it with is a full stop:

.

....which means 'match any single character'. So

^.

....means search for the single character at the beginning of the next
paragraph.

All of the regular expression characters above ('\t', '$', '\n', '^',
'.') can be used in a search term. '\t', '\n' and '.' each match a
single character, while '^' and '$' match a position (the start or end
of a paragraph). The '.' character is a 'wild card' character.

To delete unnecessary empty paragraphs, open the Find dialogue, enter:

^$

....the code for the start of a paragraph immediately followed by the
code for the end of a paragraph, leave the replacement box empty,
select the 'Regular expressions' option, and click 'Replace all'.
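
The same clean-up can be sketched in Python, again treating each
paragraph as one string in a list (an assumption made for the example).
Note that a paragraph holding only spaces is not empty as far as '^$'
is concerned.

```python
import re

# Assumption: each list item is one paragraph; "" is an empty paragraph.
paragraphs = ["First paragraph.", "", "Second paragraph.", "", "   ", "Third."]

empty = re.compile(r"^$")  # start of text immediately followed by its end

# Keep every paragraph that '^$' does not match.
kept = [p for p in paragraphs if not empty.search(p)]

print(kept)
```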

Searching for patterns.

^the

....means: match 'the' only where it appears at the start of a
paragraph (including the start of a longer word such as 'then'). Note
this will also match 'The', 'tHe', 'thE', 'ThE', or 'THE', because the
search ignores case by default.

Select the 'Match case' option if you want to find 'the' but ignore
'ThE'.
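
To see the effect of 'Match case', here is a small Python sketch. It
assumes the 're.IGNORECASE' flag is a fair stand-in for leaving 'Match
case' unticked; the sample paragraphs are invented.

```python
import re

paragraphs = ["the start", "The start", "THE start", "At the start", "then again"]

# Assumption: re.IGNORECASE plays the part of 'Match case' being off.
any_case = [p for p in paragraphs if re.search(r"^the", p, re.IGNORECASE)]

# Without the flag, only lower-case 'the' at the very start matches.
exact_case = [p for p in paragraphs if re.search(r"^the", p)]

print(any_case)
print(exact_case)
```

'At the start' is never found because its 'the' is not at the
beginning, while 'then again' is found both times because '^the' only
looks at the first three letters.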

Wildcards.

.he

....will match both 'the' and 'she'. It will also match any character
followed by 'he' where those letters appear in the middle of a longer
word such as 'lathes' or 'lashes'. Each '.' in the regular expression
(RE) matches exactly one character, so '..he' needs two characters
before the 'he'.

You can match any run of zero or more of the preceding character with:

*

So 'ab*c' will match 'ac', 'abc', 'abbc', 'abbbc', and so on.

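The wildcard characters '.' and '*' can be used anywhere within the
RE, and '.*' together means 'any run of characters'. You could, for
example, search for 'cat.' or 'c.*t' or '.og.*', and you'd match
'cats', 'catamount' and 'dogstar' if they appeared in your text. (On
its own, 'c*t' means zero or more 'c's followed by a 't'.)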
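
A short Python sketch of how '.' and '*' behave, bearing in mind that
'*' is not the file-name wildcard: in Python's 're' module, at least,
'c*t' means zero or more 'c's followed by 't', and it takes 'c.*t' to
span the letters in between. The sample words come from the text above.

```python
import re

# '.' stands for exactly one character of any kind.
dot_he = re.findall(r".he", "the she lathes lashes")
cat_dot = re.search(r"cat.", "cats").group(0)

# 'c*t' finds only the first 't'; 'c.*t' runs from 'c' to the last 't'.
c_star_t = re.search(r"c*t", "catamount").group(0)
c_dotstar_t = re.search(r"c.*t", "catamount").group(0)

# '.og.*' takes one character, 'og', then everything that follows.
og = re.search(r".og.*", "dogstar").group(0)

print(dot_he, cat_dot, c_star_t, c_dotstar_t, og)
```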

'+' means “match one or more of the previous character”. So while
'as*' will match 'a', 'as', 'ass' and 'asss',
'as+' will fail to match 'a' but will match 'as', 'ass', 'asss', etc.

'?' will match 'zero or one' of the previous character. So 'as?' will
match 'a' or 'as' and nothing else; it means “match a single 'a'
followed by an optional 's'”.
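
The three repeat characters can be compared side by side in Python.
This sketch uses 're.fullmatch' so that a word only counts if the whole
of it fits the pattern; the Find dialogue would also find shorter
matches inside longer words. The helper name is made up.

```python
import re

candidates = ["a", "as", "ass", "asss"]

def whole_matches(pattern, words):
    """Return the words that the pattern matches from start to end."""
    return [w for w in words if re.fullmatch(pattern, w)]

star = whole_matches(r"as*", candidates)      # zero or more 's'
plus = whole_matches(r"as+", candidates)      # one or more 's'
optional = whole_matches(r"as?", candidates)  # zero or one 's'
abc = whole_matches(r"ab*c", ["ac", "abc", "abbbc", "adc"])

print(star, plus, optional, abc)
```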

Character sets.

If you don't want to keep selecting and deselecting the 'Match case'
option, use 'character sets': a number of characters inside square
brackets. Search for 'the' variants with:

[Tt][Hh][Ee]

Here:

[Tt]

...means “match either a 'T' or a 't' but nothing else”. You could
search for any vowel with '[aeiou]' and it would match just one vowel
wherever it appeared. So you could search for any word that has more
than one consecutive vowel with '[aeiou][aeiou]+' which means “match
any vowel followed by at least one other vowel”.

Just as '[Tt]' means “match either a 'T' or a 't'”, '[bcd]og' means
“match the letters 'og' wherever they have a 'b', a 'c', or a 'd' in
front of them”. Instead of:

[bcdefg]og

...you can use:

[b-g]og

And you can combine sets, too:

[b-dn-w]oggle

....would match 'boggle', 'coggle', 'doggle', 'noggle', 'roggle', and
'woggle' etc.

Suppose you wanted to find all of the three-letter words that ended
with 'og' but didn't start with 'b', 'c', or 'd'. The:

^

...character, inside square brackets means “negate the following set”.
In other words:

[^b-d]og

...will match words like 'log' and 'fog', but not 'bog', 'cog', or
'dog'. So '^' has two meanings: one inside a character set and another
outside.
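
Here is a Python sketch that runs the character-set examples from this
section. It assumes Python's set syntax matches the Find dialogue's for
these simple cases; the word lists are invented, and 're.fullmatch' is
used so that only whole words count.

```python
import re

words = ["bog", "cog", "dog", "fog", "log", "hog"]

in_range = [w for w in words if re.fullmatch(r"[b-d]og", w)]
negated = [w for w in words if re.fullmatch(r"[^b-d]og", w)]  # '^' negates

# Two ranges combined in one set: b to d, or n to w.
oggles = ["boggle", "noggle", "woggle", "foggle", "xoggle"]
combined = [w for w in oggles if re.fullmatch(r"[b-dn-w]oggle", w)]

# Runs of two or more vowels, and 'the' in any mix of case.
double_vowels = re.findall(r"[aeiou][aeiou]+", "a queue of beautiful geese")
the_variants = re.findall(r"[Tt][Hh][Ee]", "The THE tHe them")

print(in_range, negated, combined, double_vowels, the_variants)
```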

Odds and ends.

A character set gives us a way of searching for alternate characters
within our text. But suppose we want to search for alternate words? To
search for 'words' or 'text' use:

words|text

The '|' means 'or'.

When you get to the expert stage you'll be mixing and matching search
terms and using such examples as '[wt]hen|(where|how)ever'. Note the
use of parentheses to group elements together. Read it piece by piece
and the intent becomes clear. The example means “match if you find
'when' or 'then', OR if you find either 'wherever' or 'however'”. Note
that this example wouldn't match the whole word 'whenever' (only the
'when' in it), but this example would: '(when|where|how)ever'. And
this could also be written, at the cost of legibility, as
'(wh(en|ere)|how)ever'.

While '^' and '$' allow you to match text at the beginning and end of
a paragraph, it would also be useful if you could search for text that
only occurs at the beginning or end of a word, ignoring any space
and/or punctuation. Of course, you can: '\<' matches the beginning of
a word and '\>' matches the end. So if you wanted to search for words
that began with 'the' you could use '\<the'. Note that this will find
'the', 'there', 'them', and so on, but wouldn't match 'out-there' or
the 'the' in 'tithe'. So instead of using the Find dialog's 'Whole
words only' option, you can search for '\<[st]he\>' and know that it
will only match the words 'the' or 'she' and nothing else.
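
Python's 're' module has no separate start-of-word and end-of-word
codes, so the sketch below swaps in '\b', which matches at either edge
of a word. That substitution is an assumption made for the example, not
the Find dialogue's own syntax, and '\b' treats a hyphen as a word
break, so hyphenated words such as 'out-there' may not behave as
described above.

```python
import re

text = "the there them tithe she ashes"

# Words that begin with 'the': a word edge, 'the', then any more letters.
starts = re.findall(r"\bthe\w*", text)

# Only the whole words 'the' or 'she': a word edge on both sides.
whole = re.findall(r"\b[st]he\b", text)

print(starts)
print(whole)
```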

http://homepage.ntlworld.com/garrykn.../ooregexp.html
