alt.talk.weather (General Weather Talk) (alt.talk.weather) A general forum for discussion of the weather. |
Non-printing characters.
Turn on regular expressions by selecting the 'Regular expressions' option in the Edit > 'Find & Replace' dialogue; this allows you to search for 'wildcards'. Wildcards are similar to the wildcards you might use in specifying a range of file names (for example, 'C:\My Documents\text*.*' or '/home/me/text*'). The text you're searching for cannot span a paragraph boundary, but it can be found in any number of paragraphs.

To search for tabs and paragraph endings, open the Find dialogue and select the 'Regular expressions' option. For a tab, enter:

    \t

To find more than one tab character, just enter '\t' more than once:

    \t\t\t

...searches for 3 tabs in a row. And you can, of course, search for terms such as this one:

    Date\tTime\tName

...which searches for any occurrence of the word 'Date', followed by a tab, followed by the word 'Time', followed by another tab, followed by the word 'Name'.

The 'end-of-line' character:

    $

If you want to find a paragraph that ends with a particular piece of text, enter the word or words you want to find followed by the dollar sign, e.g.:

    rhubarb rhubarb full stop or whatever$

...in the Search box. You can find the next paragraph that ends with this text simply by clicking 'Find' (or pressing the Enter key) again.

You can only use regular expressions containing the end-of-line character ('$') within a single paragraph. You can't expect a term that crosses the line, such as:

    rhubarb rhubarb.$Rhubarb blabber.

...to work; you'll just get the message “Search key not found”.

A 'row break', also sometimes known as a 'soft line break', is the one you generate when you press Shift+Enter. It puts the text on the next line without starting a new paragraph. To search for a row break enter:

    \n

...for 'newline' in the Search box. And, of course, searching for:

    now\nit

...would find the row break between 'now' and 'it', if you remember to select the 'Regular expressions' option.

Use the '^' character to search for the start of a line.
It won't work if you enter it in the Search box on its own. You need to follow it with something, and the simplest thing to follow it with is a full stop:

    .

...which means 'match any single character'. So:

    ^.

...means search for the single character at the beginning of the next paragraph.

All of the regular expression characters above ('\t', '$', '\n', '^', '.') can be combined in a search term; each stands for a single character or position. The '.' character is a 'wild card' character.

To delete unnecessary empty paragraphs, open the Find dialogue, enter:

    ^$

...the code for the start of a paragraph immediately followed by the code for the end of a paragraph, select the 'Regular expressions' option, leave the replacement box empty, and click 'Replace all'.

Searching for patterns.

    ^the

...means: match the word 'the' only where it appears at the start of a paragraph. Note this will also match 'The', 'tHe', 'thE', 'ThE', or 'THE'. You need to select the 'Match case' option to ignore 'ThE' but find 'the'.

Wildcards.

    .he

...will match both 'the' and 'she'. It will also match any letter followed by 'he' where those letters appear in the middle of a longer word such as 'lathes' or 'lashes'. If you repeat the '.' in the RE (regular expression), each '.' must match one character. You can match any run of zero or more of the same character with:

    *

So 'ab*c' will match 'ac', 'abc', 'abbc', 'abbbc', and so on.

The wildcard characters '.' and '*' can be used anywhere within the RE. You could, for example, search for 'cat.' or 'c.*t' or '.og.*', and you'd match 'cats', 'catamount' and 'dogstar' if they appeared in your text.

'+' means “match one or more of the previous character”. So while 'as*' will match 'a', 'as', 'ass' and 'asss', 'as+' will fail to match 'a' but will match 'as', 'ass', 'asss', etc.

'?' will match 'zero or one' of the previous character. So 'as?' will match 'a' or 'as' and nothing else, meaning “match a single 'a' followed by an optional 's'”.

Character sets.

If you don't want to keep selecting and deselecting the 'Match case' option, use 'character sets': a number of characters inside square brackets. Search for the 'the' variants with:

    [Tt][Hh][Ee]

[Tt] means “match either a 'T' or a 't' but nothing else”. You could search for any vowel with '[aeiou]' and it would match just one vowel wherever it appeared. So you could search for any word that has more than one consecutive vowel with '[aeiou][aeiou]+', which means “match any vowel followed by at least one other vowel”.

Just as '[Tt]' means “match either a 'T' or a 't'”, '[bcd]og' means “match the letters 'og' wherever they have a 'b', a 'c', or a 'd' in front of them”. Instead of:

    [bcdefg]og

...you can use:

    [b-g]og

And you can combine sets, too:

    [b-dn-w]oggle

...would match 'boggle', 'coggle', 'doggle', 'noggle', 'roggle', and 'woggle' etc.

Suppose you wanted to find all of the three-letter words that ended with 'og' but didn't start with 'b', 'c', or 'd'. The:

    ^

...character, inside square brackets, means “negate the following set”. In other words:

    [^b-d]og

...will match words like 'log' and 'fog', but not 'bog', 'cog', or 'dog'. So '^' has two meanings: one inside a character set and another outside.

Odds and ends.

A character set gives us a way of searching for alternate characters within our text. But suppose we want to search for alternate words? To search for 'words' or 'text' use:

    words|text

The '|' means 'or'. When you get to the expert stage you'll be mixing and matching search terms and using such examples as '[wt]hen|(where|how)ever'. Note the use of parentheses to group elements together. Build it up piece by piece and you'll know what you intended. The example means “match if you find 'when' or 'then', OR if you find either 'wherever' or 'however'”. Note that this example wouldn't match the whole word 'whenever', but this example would: '(when|where|how)ever'. And this could also be written, at the cost of legibility, as '(wh(en|ere)|how)ever'.
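If you'd like to try the character-set and alternation examples outside the Find dialogue, here is a minimal sketch in Python. It assumes Python's `re` module as a stand-in, which is not the same engine the dialogue uses, so treat it as an illustration only. The sample sentence and the variable names are made up, and the `(?:...)` non-capturing group is a Python convenience so that `findall` returns whole matches; it is not part of the examples above.

```python
import re

# Made-up sample text; Python's re module stands in for the Find dialogue.
text = "The dog sat on a log in the fog; then, however, the cog turned whenever."

# One character set per letter: 'the' in any mix of upper and lower case.
the_variants = re.findall(r"[Tt][Hh][Ee]", text)

# Range: 'og' with any letter from b to g in front of it.
range_hits = re.findall(r"[b-g]og", text)

# Negated set: 'og' with any character except b, c or d in front of it.
negated_hits = re.findall(r"[^b-d]og", text)

# Alternation: 'when' or 'then', OR 'wherever' or 'however'.
first_alt = re.findall(r"[wt]hen|(?:where|how)ever", text)

# Grouping all three stems lets the whole of 'whenever' match.
second_alt = re.findall(r"(?:when|where|how)ever", text)

print(the_variants)  # ['The', 'the', 'the', 'the']
print(range_hits)    # ['dog', 'fog', 'cog']
print(negated_hits)  # ['log', 'fog']
print(first_alt)     # ['then', 'however', 'when']
print(second_alt)    # ['however', 'whenever']
```

Note that the first alternation only picks up the 'when' inside 'whenever', while the second takes the whole word. One of the four 'the' hits is the start of 'then', since a character set on its own pays no attention to where words begin or end.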
While '^' and '$' allow you to match text at the beginning and end of a paragraph, it would also be useful if you could search for text that only occurs at the beginning or end of a word, ignoring any space and/or punctuation. Of course, you can: '\<' matches the beginning of a word and '\>' matches the end. So if you wanted to search for words that began with 'the' you could use '\<the'. Note that this will find 'the', 'there', 'them', and so on, but wouldn't match 'out-there' or the 'the' in 'tithe'. So instead of using the Find dialogue's 'Whole words only' option, you can search for '\<[st]he\>' and know that it will only match the words 'the' or 'she' and nothing else.

http://homepage.ntlworld.com/garrykn.../ooregexp.html
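The word-boundary searches can be sketched in Python as well, with one swap that should be stated plainly: Python's `re` module has no separate start-of-word and end-of-word codes, so this example uses `\b`, which marks either edge of a word. The sample sentence and variable names are invented for illustration, and the dialogue itself may behave differently.

```python
import re

# Made-up sample text for illustration.
text = "Then she said the lathes are out there, and he paid the tithe."

# 'the' only where it begins a word; \w* picks up the rest of that word.
starts_with_the = re.findall(r"\bthe\w*", text)

# Whole words 'the' or 'she' only: a word edge on both sides.
whole_words = re.findall(r"\b[st]he\b", text)

print(starts_with_the)  # ['the', 'there', 'the']
print(whole_words)      # ['she', 'the', 'the']
```

Neither search touches 'lathes' or 'tithe', and 'Then' is skipped because Python matches case by default. Python counts a hyphen as a word break, so `\bthe` would find the 'there' in a hyphenated word such as 'out-there'.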