Ask Geoff About Excel (or Google Sheets)

Health insurance rip off lying FDA big bankers buying
Fake computer crashes dining
Cloning while they're multiplying
Fashion shoots with Beck and Hanson
Courtney Love, and Marilyn Manson
You're all fakes
Run to your mansions
Come around
We'll kick your ass in

Postby doublethink0 » Mon May 14, 2018 1:23 pm

Geoff, Excel Witch and Magician ®️
Image
User avatar
doublethink0
 
Posts: 1685
Joined: Sat Sep 05, 2015 3:17 pm
Location: CLE

Postby Geoff » Mon May 14, 2018 1:32 pm

Geoff wrote:
Jsn wrote:Can you format 99.99 so that the cents displays as superscript


yes it a round about way, way complext than it should be.

here's how I did it in excel (for PC), it uses special superscript characters and lookups and various left/mid formulas.

Image

Let me know if you want me to send you the spreadsheet.


just be aware the format of the cell after converting to superscript for cents, will be text, and therefore you'll not be add them / multiple them. You'll need to use the original numbers if you want to do some calculation.
User avatar
Geoff
i like jazz
 
Posts: 5678
Joined: Sat Dec 26, 2009 8:33 am

Postby gustavo » Mon May 14, 2018 2:04 pm

I've really, really enjoyed your solution to this, thanks Geoff.
gustavo
 
Posts: 92
Joined: Tue Jul 11, 2017 3:42 pm

Postby Jsn » Mon May 14, 2018 3:35 pm

Geoff wrote:
Geoff wrote:
Jsn wrote:Can you format 99.99 so that the cents displays as superscript


yes it a round about way, way complext than it should be.

here's how I did it in excel (for PC), it uses special superscript characters and lookups and various left/mid formulas.

Image

Let me know if you want me to send you the spreadsheet.


just be aware the format of the cell after converting to superscript for cents, will be text, and therefore you'll not be add them / multiple them. You'll need to use the original numbers if you want to do some calculation.



Yes, could you please?
User avatar
Jsn
 
Posts: 462
Joined: Thu Dec 07, 2017 11:15 pm
Location: Sherman Oaks

Postby Geoff » Mon May 14, 2018 3:48 pm

here's a link to the file, only valid for 24 hours please let me know if you successfully download it.

https://uploadfiles.io/tsmbc
User avatar
Geoff
i like jazz
 
Posts: 5678
Joined: Sat Dec 26, 2009 8:33 am

Postby Geoff » Mon May 14, 2018 4:15 pm

For a plain english explanation of the formula please see below (for example 1).

Part 1
Formula: IF(ISERROR(FIND(".",G4,1)),
Action: Check to see if there is a decimal place in the cell referenced
Note: A round number in number formated cell will not have a decimal place despite even if you force the format to show it


Part 2
Formula: G4&" ⁰⁰",
Action: if there is no decimal place, take the whole number and add the 00 as superscript
Note:Without this ERROR check, the formula would not have worked for whole numbers


Part 3
Formula: LEFT(G5,FIND(".",G5,1)-1)
Action: From the left of cell, take the number returned based on the number position of the decimal place, minus 1 to remove the decimal place itself
Note:Find formula finds the position of the selected character, but if you want to exclude that in a LEFT formula remember to minus 1


Part 4
Formula: &VLOOKUP(MID(G4,FIND(".",G4,1)+1,1),$B:$C,2,0)
Action: Lookup the 1st decimal place against the look up table, and return the superscript special character for this number
Note:Extracting using a MID formula, converts a numeric cell into text, this is why I had to force the lookup table to be text rather than numeric format


Part 5
Formula: VLOOKUP(MID(G4,FIND(".",G4,1)+2,1),$B:$C,2,0))
Action: Lookup the 2nd decimal place against the look up table, and return the superscript special character for this number
Note:When you break it down it's pretty simple, I might if you ask nicely improve the formula to add a comma if the value is over 999.99, so let me know if you need it, I could also add a dollar sign as well if that helps
User avatar
Geoff
i like jazz
 
Posts: 5678
Joined: Sat Dec 26, 2009 8:33 am

Postby fakename » Mon May 14, 2018 4:49 pm

Is there a range variable I can use to specify the last non-empty cell in column?
For example, I use this formula:
=IF(COUNTIF(A3:A488,"<>D/A")>0,"show","hide")
Every time I use it, I have to look up how many rows are used, and update that number (488 in this case).

Is there something I could put here that would remove the need to manually count and change the number?
User avatar
fakename
 
Posts: 3864
Joined: Mon Dec 14, 2009 8:27 pm

Postby Geoff » Mon May 14, 2018 5:25 pm

If you are ok to just look up the whole column excluding the header(s) you could just do this, looks up entire column excluding headers.

=IF(COUNTIF(A3:INDEX(A:A,ROWS(A:A)),"D/A"),"hide","show")

I think this is all you need, it looks up the complete range of the column excluding headers, similar to using COUNTIF(A3:A,"D/A") in Google Spreadsheets

If you really reallly need to look up to only the last non-blank cell it's more complicated as it would be this

=IFERROR(IF(COUNTIF(A3:INDIRECT(SUBSTITUTE(ADDRESS(1,COLUMN(A:A),4),1,"")&SUMPRODUCT(MAX((A3:INDEX(A:A,ROWS(A:A))<>"")*ROW(A3:INDEX(A:A,ROWS(A:A)))))),"D/A"),"hide","show"),"")

The second option is very inefficient and actually managed to overload my excel by forcing it to calculate too much, so do use the first one, the second was more proof of conecept haha.
User avatar
Geoff
i like jazz
 
Posts: 5678
Joined: Sat Dec 26, 2009 8:33 am

Postby fakename » Mon May 14, 2018 5:48 pm

The formula I posted looks to see if any of the populated cells do not equal "D/A"
Both of those formulas included blank space after the last populated cell, which causes the formula to return a false positive.

It looks like =COUNTA(A:A) returns the number I'm looking for. Is it possible to embed that into my formula?

e.g.
=IF(COUNTIF(A3:A(COUNTA(A:A)),"<>D/A")>0,"show","hide")
User avatar
fakename
 
Posts: 3864
Joined: Mon Dec 14, 2009 8:27 pm

Postby Geoff » Mon May 14, 2018 5:52 pm

fakename wrote:The formula I posted looks to see if any of the populated cells do not equal "D/A"
Both of those formulas included blank space after the last populated cell, which causes the formula to return a false positive.

It looks like =COUNTA(A:A) returns the number I'm looking for. Is it possible to embed that into my formula?

e.g.
=IF(COUNTIF(A3:A(COUNTA(A:A)),"<>D/A")>0,"show","hide")


I'm confused a little here, the second formula will show the last row number which might be useful, I got to go sleep now, but will provide a final solution tomorrow.

Why is going to the last populated cell so important? how will that be more efficent than just searching if there is a D/A in the column below the headers? Sorry for my stupidity here :(
User avatar
Geoff
i like jazz
 
Posts: 5678
Joined: Sat Dec 26, 2009 8:33 am

Postby Arturo » Mon May 14, 2018 6:07 pm

Hi Geoff,

How do I copy and paste over formulas as values when there is a filter on the column?

I've read up on how you should be able to do it as per the below with only visible cells but it doesn't work :cry:

Image

Image
User avatar
Arturo
 
Posts: 2651
Joined: Mon Dec 14, 2009 5:39 pm
Location: Manchester

Postby Geoff » Mon May 14, 2018 7:53 pm

Try regular copy and paste
User avatar
Geoff
i like jazz
 
Posts: 5678
Joined: Sat Dec 26, 2009 8:33 am

Postby Arturo » Tue May 15, 2018 4:43 am

Doesn't like it. It just says "That command cannot be used on multiple selections".
User avatar
Arturo
 
Posts: 2651
Joined: Mon Dec 14, 2009 5:39 pm
Location: Manchester

Postby Geoff » Tue May 15, 2018 5:00 am

send me your test spreadsheet, what version of excel are you using?
User avatar
Geoff
i like jazz
 
Posts: 5678
Joined: Sat Dec 26, 2009 8:33 am

Postby Geoff » Tue May 15, 2018 9:07 am

woah google sheets now has a record marco function!
User avatar
Geoff
i like jazz
 
Posts: 5678
Joined: Sat Dec 26, 2009 8:33 am

Postby Geoff » Tue May 15, 2018 5:52 pm

JSN did youi manage to download the file before it expired?
User avatar
Geoff
i like jazz
 
Posts: 5678
Joined: Sat Dec 26, 2009 8:33 am

Postby Arturo » Tue May 15, 2018 5:53 pm

Geoff wrote:send me your test spreadsheet, what version of excel are you using?

Just saw this will send tomorrow thanks for assistance! This pisses me off at work every day
User avatar
Arturo
 
Posts: 2651
Joined: Mon Dec 14, 2009 5:39 pm
Location: Manchester

Postby Geoff » Wed May 16, 2018 8:28 pm

bump
User avatar
Geoff
i like jazz
 
Posts: 5678
Joined: Sat Dec 26, 2009 8:33 am

Postby Jsn » Thu May 17, 2018 9:20 pm

I told ppl at work. I will relaunch project soon. I didn’t dl it. I am amazed all of that being said. You rock. Thank you.
User avatar
Jsn
 
Posts: 462
Joined: Thu Dec 07, 2017 11:15 pm
Location: Sherman Oaks

Postby Geoff » Fri May 18, 2018 5:44 pm

do you want me to reupload it? or do you not need it now jsn
User avatar
Geoff
i like jazz
 
Posts: 5678
Joined: Sat Dec 26, 2009 8:33 am

Postby wintergreen » Fri May 18, 2018 5:53 pm

come on, jsn, take the spreadsheet.
User avatar
wintergreen
 
Posts: 9795
Joined: Mon Dec 14, 2009 2:57 pm

Postby Geoff » Fri May 18, 2018 5:54 pm

put all that effort in for a super complicated solution that I doubt any one else has ever done and this is qwhat I get :roll:

just kidding :awful:
User avatar
Geoff
i like jazz
 
Posts: 5678
Joined: Sat Dec 26, 2009 8:33 am

Postby Geoff » Fri May 18, 2018 5:58 pm

Arturo and fake name it's friday night now and i have the whole weekend to solve your problems, let me know if you still need help.
User avatar
Geoff
i like jazz
 
Posts: 5678
Joined: Sat Dec 26, 2009 8:33 am

Postby banquo » Fri May 18, 2018 6:41 pm

I have a few new excel questions that came up while working in some spreadsheets this week -- none of these are actually crucial or time sensitive:

1) If i'm building a spreadsheet in sheet1 that pulls row data from sheet2, but the amount of non-pulled extra rows i use per section is different, is there a certain formula i should be using to make this more automatic when copy/pasting those parts of each section? I did this manually & find/replaced the wrong row #s to make it work, but i thought there's probably a better way. (example: rows 4-12 will be using pulled data from sheet 2, but rows 13-23 may be used for commentary or other purposes. I will start a new section on Row 25, but it might have 20 or 30 extra rows for the commentary/other purposes)

2) If I have a cell with multiple items separated by commas (a comma and 1 space), and i want to separate them out, how do i not include those spaces? when i did this manually it put the spaces before each word in the new cells. Some of the items were multiple words, so I couldn't just remove all spaces. Is there a simple setting i missed that would remove the spaces?

3) the purpose of separating out #2 is to find duplicates among lots of multi-item lists, and i was using =if(countif($range:$range,cell)>1, "duplicate","") -- is there a better duplicate formula you'd suggest? (also, see #4)

4) when looking for duplicates, how should I write a formula if i'm trying to find singular/plural "duplicates" -- ie, i'd want to identify if both "pear" and "pears", or "red apple" and "red apples", or "makes time" and "make time" appeared in the full list without manually reviewing - how could i improve the formula from #3 for this?
User avatar
banquo
 
Posts: 3316
Joined: Tue Dec 15, 2009 8:28 pm

Postby draw » Fri May 18, 2018 10:48 pm

Would love to see Geoff whip up an Excel formulas only implementation of Porter stemming for #4 lol
User avatar
draw
 
Posts: 1992
Joined: Fri Oct 20, 2017 1:49 pm

Postby Jsn » Sun May 20, 2018 10:09 pm

Im very appreciative. Sorry if that didn’t come across. If you reupload i will need to check with IT to cma before downloading but i think its worth it so yes please reupload
User avatar
Jsn
 
Posts: 462
Joined: Thu Dec 07, 2017 11:15 pm
Location: Sherman Oaks

Postby Geoff » Thu May 24, 2018 5:12 pm

banquo wrote:I have a few new excel questions that came up while working in some spreadsheets this week -- none of these are actually crucial or time sensitive:

1) If i'm building a spreadsheet in sheet1 that pulls row data from sheet2, but the amount of non-pulled extra rows i use per section is different, is there a certain formula i should be using to make this more automatic when copy/pasting those parts of each section? I did this manually & find/replaced the wrong row #s to make it work, but i thought there's probably a better way. (example: rows 4-12 will be using pulled data from sheet 2, but rows 13-23 may be used for commentary or other purposes. I will start a new section on Row 25, but it might have 20 or 30 extra rows for the commentary/other purposes)

2) If I have a cell with multiple items separated by commas (a comma and 1 space), and i want to separate them out, how do i not include those spaces? when i did this manually it put the spaces before each word in the new cells. Some of the items were multiple words, so I couldn't just remove all spaces. Is there a simple setting i missed that would remove the spaces?

3) the purpose of separating out #2 is to find duplicates among lots of multi-item lists, and i was using =if(countif($range:$range,cell)>1, "duplicate","") -- is there a better duplicate formula you'd suggest? (also, see #4)

4) when looking for duplicates, how should I write a formula if i'm trying to find singular/plural "duplicates" -- ie, i'd want to identify if both "pear" and "pears", or "red apple" and "red apples", or "makes time" and "make time" appeared in the full list without manually reviewing - how could i improve the formula from #3 for this?


I kind of need to see an example of 1) to understand what you mean, I can quite understand what you need (my apologies)

2) =could just find and replace ", " with "," or with a formula =SUBSTITUTE(N8,", ",",")

3) I mean it is possible to do the find duplicate without spliting cells by using this script (ps it's not mine I'm not that clever, I think it only works on Excel for Windows let me know if you use windows

Code: Select all
Function RemoveDupes2(txt As String, Optional delim As String = " ") As String
    Dim x
    'Updateby20140924
    With CreateObject("Scripting.Dictionary")
        .CompareMode = vbTextCompare
        For Each x In Split(txt, delim)
            If Trim(x) <> "" And Not .exists(Trim(x)) Then .Add Trim(x), Nothing
        Next
        If .Count > 0 Then RemoveDupes2 = Join(.keys, delim)
    End With
End Function


Then use the formula =RemoveDupes2(SUBSTITUTE(celladdress,", ",","),","),", ")

This will remove the space in the deliminter, and then remove duplicates.

4) as draw said the only real way to properly do this is using a method called Porter stemming, which is insanely complicated.

If you wanted to do a very dirty and very very very bad version (which I very much don't suggest), you could just remove all s, and than apply the remove duplicates formula that i linked to.

so it would be =RemoveDupes2(SUBSTITUTE(SUBSTITUTE(celladdress,"s",""),", ",","),",").

If you really really want to do porter stemming in excel, I did manage to dig out a VBA based porter stemmer (of course I didn't write), you'll need to split the cell after using the removedupes2 as suggested in 3) (NOT 4)

And then apply the porter stemmer formula to each sell, and then filter the duplicates again using the countif thing you were using

Here the porter steemer vba, as you can see complicated

Code: Select all
'Porter Stemmer in VISUAL BASIC 6. It follow the algorithm definition
'presented in :
'   Porter, 1980, An algorithm for suffix stripping, Program, Vol. 14,
'   no. 3, pp 130-137,
'   (http://www.tartarus.org/~martin/PorterStemmer)

'Author : Navonil Mustafee
'Brunel University - student
'Algorithm Implemented as part for assignment on document visualization

'TO USE THE PROGRAM CALL THE FUNCTION PORTERALGORITHM. THE WORD
'TO BE STEMMED SHOULD BE PASSED AS THE ARGUEMENT ARGUEMENT. THE STRING
'RETURNED BY THE FUNCTION IS THE STEMMED WORD

Option Explicit

Public Function porterAlgorithm(str As String) As String

On Error Resume Next

'only strings greater than 2 are stemmed
If Len(Trim(str)) > 2 Then
    str = porterAlgorithmStep1(str)
    str = porterAlgorithmStep2(str)
    str = porterAlgorithmStep3(str)
    str = porterAlgorithmStep4(str)
    str = porterAlgorithmStep5(str)
End If

'End of Porter's algorithm.........returning the word
porterAlgorithm = str

End Function

Private Function porterAlgorithmStep1(str As String) As String

On Error Resume Next

'STEP 1A
'
'    SSES -> SS                         caresses  ->  caress
'    IES  -> I                          ponies    ->  poni
'                                       ties      ->  ti
'    SS   -> SS                         caress    ->  caress
'    S    ->                            cats      ->  cat


'declaring local variables
Dim i As Byte
Dim j As Byte
Dim step1a(3, 1) As String

'initializing contents of 2D array
step1a(0, 0) = "sses"
step1a(0, 1) = "ss"
step1a(1, 0) = "ies"
step1a(1, 1) = "i"
step1a(2, 0) = "ss"
step1a(2, 1) = "ss"
step1a(3, 0) = "s"
step1a(3, 1) = ""

'checking word
For i = 0 To 3 Step 1
    If porterEndsWith(str, step1a(i, 0)) Then
            str = porterTrimEnd(str, Len(step1a(i, 0)))
            str = porterAppendEnd(str, step1a(i, 1))
            Exit For
    End If
Next i


'--------------------------------------------------------------------------------------------------------

'STEP 1B
'
'   If
'       (m>0) EED -> EE                     feed      ->  feed
'                                           agreed    ->  agree
'   Else
'       (*v*) ED  ->                        plastered ->  plaster
'                                           bled      ->  bled
'       (*v*) ING ->                        motoring  ->  motor
'                                           sing      ->  sing
'
'If the second or third of the rules in Step 1b is successful, the following
'is done:
'
'    AT -> ATE                       conflat(ed)  ->  conflate
'    BL -> BLE                       troubl(ed)   ->  trouble
'    IZ -> IZE                       siz(ed)      ->  size
'    (*d and not (*L or *S or *Z))
'       -> single letter
'                                    hopp(ing)    ->  hop
'                                    tann(ed)     ->  tan
'                                    fall(ing)    ->  fall
'                                    hiss(ing)    ->  hiss
'                                    fizz(ed)     ->  fizz
'    (m=1 and *o) -> E               fail(ing)    ->  fail
'                                    fil(ing)     ->  file
'
'The rule to map to a single letter causes the removal of one of the double
'letter pair. The -E is put back on -AT, -BL and -IZ, so that the suffixes
'-ATE, -BLE and -IZE can be recognised later. This E may be removed in step
'4.

'declaring local variables
Dim m As Byte
Dim temp As String
Dim second_third_success As Boolean

'initializing contents of 2D array
second_third_success = False

'(m>0) EED -> EE..else..(*v*) ED  ->(*v*) ING  ->
If porterEndsWith(str, "eed") Then

    'counting the number of m's
    temp = porterTrimEnd(str, Len("eed"))
    m = porterCountm(temp)

    If m > 0 Then
            str = porterTrimEnd(str, Len("eed"))
            str = porterAppendEnd(str, "ee")
    End If

ElseIf porterEndsWith(str, "ed") Then
 
        'trim and check for vowel
        temp = porterTrimEnd(str, Len("ed"))

        If porterContainsVowel(temp) Then
            str = porterTrimEnd(str, Len("ed"))
            second_third_success = True
        End If
       
ElseIf porterEndsWith(str, "ing") Then

        'trim and check for vowel
        temp = porterTrimEnd(str, Len("ing"))
       
        If porterContainsVowel(temp) Then
            str = porterTrimEnd(str, Len("ing"))
            second_third_success = True
        End If

End If

'If the second or third of the rules in Step 1b is SUCCESSFUL, the following
'is done:
'
'    AT -> ATE                       conflat(ed)  ->  conflate
'    BL -> BLE                       troubl(ed)   ->  trouble
'    IZ -> IZE                       siz(ed)      ->  size
'    (*d and not (*L or *S or *Z))
'       -> single letter
'                                    hopp(ing)    ->  hop
'                                    tann(ed)     ->  tan
'                                    fall(ing)    ->  fall
'                                    hiss(ing)    ->  hiss
'                                    fizz(ed)     ->  fizz
'    (m=1 and *o) -> E               fail(ing)    ->  fail
'                                    fil(ing)     ->  file


If second_third_success = True Then             'If the second or third of the rules in Step 1b is SUCCESSFUL
       
    If porterEndsWith(str, "at") Then           'AT -> ATE
            str = porterTrimEnd(str, Len("at"))
            str = porterAppendEnd(str, "ate")
    ElseIf porterEndsWith(str, "bl") Then       'BL -> BLE
            str = porterTrimEnd(str, Len("bl"))
            str = porterAppendEnd(str, "ble")
    ElseIf porterEndsWith(str, "iz") Then       'IZ -> IZE
            str = porterTrimEnd(str, Len("iz"))
            str = porterAppendEnd(str, "ize")
    ElseIf porterEndsDoubleConsonent(str) Then  '(*d and not (*L or *S or *Z))-> single letter
            If Not (porterEndsWith(str, "l") Or porterEndsWith(str, "s") Or porterEndsWith(str, "z")) Then
                str = porterTrimEnd(str, 1)
            End If
    ElseIf porterCountm(str) = 1 Then                           '(m=1 and *o) -> E
            If porterEndsCVC(str) Then
                   str = porterAppendEnd(str, "e")
            End If
    End If
   
End If


'--------------------------------------------------------------------------------------------------------
'
'STEP 1C
'
'    (*v*) Y -> I                    happy        ->  happi
'                                    sky          ->  sky

If porterEndsWith(str, "y") Then
       
        'trim and check for vowel
        temp = porterTrimEnd(str, 1)

        If porterContainsVowel(temp) Then
            str = porterTrimEnd(str, Len("y"))
            str = porterAppendEnd(str, "i")
        End If
       
End If

'retuning the word
porterAlgorithmStep1 = str

End Function

Private Function porterAlgorithmStep2(str As String) As String

On Error Resume Next

'STEP 2
'
'    (m>0) ATIONAL ->  ATE           relational     ->  relate
'    (m>0) TIONAL  ->  TION          conditional    ->  condition
'                                    rational       ->  rational
'    (m>0) ENCI    ->  ENCE          valenci        ->  valence
'    (m>0) ANCI    ->  ANCE          hesitanci      ->  hesitance
'    (m>0) IZER    ->  IZE           digitizer      ->  digitize
'Also,
'    (m>0) BLI    ->   BLE           conformabli    ->  conformable
'
'    (m>0) ALLI    ->  AL            radicalli      ->  radical
'    (m>0) ENTLI   ->  ENT           differentli    ->  different
'    (m>0) ELI     ->  E             vileli        - >  vile
'    (m>0) OUSLI   ->  OUS           analogousli    ->  analogous
'    (m>0) IZATION ->  IZE           vietnamization ->  vietnamize
'    (m>0) ATION   ->  ATE           predication    ->  predicate
'    (m>0) ATOR    ->  ATE           operator       ->  operate
'    (m>0) ALISM   ->  AL            feudalism      ->  feudal
'    (m>0) IVENESS ->  IVE           decisiveness   ->  decisive
'    (m>0) FULNESS ->  FUL           hopefulness    ->  hopeful
'    (m>0) OUSNESS ->  OUS           callousness    ->  callous
'    (m>0) ALITI   ->  AL            formaliti      ->  formal
'    (m>0) IVITI   ->  IVE           sensitiviti    ->  sensitive
'    (m>0) BILITI  ->  BLE           sensibiliti    ->  sensible
'Also,
'    (m>0) LOGI    ->  LOG           apologi        -> apolog
'
'The test for the string S1 can be made fast by doing a program switch on
'the penultimate letter of the word being tested. This gives a fairly even
'breakdown of the possible values of the string S1. It will be seen in fact
'that the S1-strings in step 2 are presented here in the alphabetical order
'of their penultimate letter. Similar techniques may be applied in the other
'steps.

'declaring local variables
Dim step2(20, 1) As String
Dim i As Byte
Dim temp As String

'initializing contents of 2D array
step2(0, 0) = "ational"
step2(0, 1) = "ate"
step2(1, 0) = "tional"
step2(1, 1) = "tion"
step2(2, 0) = "enci"
step2(2, 1) = "ence"
step2(3, 0) = "anci"
step2(3, 1) = "ance"
step2(4, 0) = "izer"
step2(4, 1) = "ize"
step2(5, 0) = "bli"
step2(5, 1) = "ble"
step2(6, 0) = "alli"
step2(6, 1) = "al"
step2(7, 0) = "entli"
step2(7, 1) = "ent"
step2(8, 0) = "eli"
step2(8, 1) = "e"
step2(9, 0) = "ousli"
step2(9, 1) = "ous"
step2(10, 0) = "ization"
step2(10, 1) = "ize"
step2(11, 0) = "ation"
step2(11, 1) = "ate"
step2(12, 0) = "ator"
step2(12, 1) = "ate"
step2(13, 0) = "alism"
step2(13, 1) = "al"
step2(14, 0) = "iveness"
step2(14, 1) = "ive"
step2(15, 0) = "fulness"
step2(15, 1) = "ful"
step2(16, 0) = "ousness"
step2(16, 1) = "ous"
step2(17, 0) = "aliti"
step2(17, 1) = "al"
step2(18, 0) = "iviti"
step2(18, 1) = "ive"
step2(19, 0) = "biliti"
step2(19, 1) = "ble"
step2(20, 0) = "logi"
step2(20, 1) = "log"

'checking word
For i = 0 To 20 Step 1
    If porterEndsWith(str, step2(i, 0)) Then
            temp = porterTrimEnd(str, Len(step2(i, 0)))
            If porterCountm(temp) > 0 Then
                str = porterTrimEnd(str, Len(step2(i, 0)))
                str = porterAppendEnd(str, step2(i, 1))
            End If
            Exit For
    End If
Next i

'retuning the word
porterAlgorithmStep2 = str

End Function

Private Function porterAlgorithmStep3(str As String) As String

On Error Resume Next

'STEP 3
'
'    (m>0) ICATE ->  IC              triplicate     ->  triplic
'    (m>0) ATIVE ->                  formative      ->  form
'    (m>0) ALIZE ->  AL              formalize      ->  formal
'    (m>0) ICITI ->  IC              electriciti    ->  electric
'    (m>0) ICAL  ->  IC              electrical     ->  electric
'    (m>0) FUL   ->                  hopeful        ->  hope
'    (m>0) NESS  ->                  goodness       ->  good


'declaring local variables
Dim i As Byte
Dim temp As String
Dim step3(6, 1) As String

'initializing contents of 2D array
step3(0, 0) = "icate"
step3(0, 1) = "ic"
step3(1, 0) = "ative"
step3(1, 1) = ""
step3(2, 0) = "alize"
step3(2, 1) = "al"
step3(3, 0) = "iciti"
step3(3, 1) = "ic"
step3(4, 0) = "ical"
step3(4, 1) = "ic"
step3(5, 0) = "ful"
step3(5, 1) = ""
step3(6, 0) = "ness"
step3(6, 1) = ""

'checking word
For i = 0 To 6 Step 1
    If porterEndsWith(str, step3(i, 0)) Then
            temp = porterTrimEnd(str, Len(step3(i, 0)))
            If porterCountm(temp) > 0 Then
                str = porterTrimEnd(str, Len(step3(i, 0)))
                str = porterAppendEnd(str, step3(i, 1))
            End If
            Exit For
    End If
Next i

'retuning the word
porterAlgorithmStep3 = str

End Function

Private Function porterAlgorithmStep4(str As String) As String

On Error Resume Next

'STEP 4
'
'    (m>1) AL    ->                  revival        ->  reviv
'    (m>1) ANCE  ->                  allowance      ->  allow
'    (m>1) ENCE  ->                  inference      ->  infer
'    (m>1) ER    ->                  airliner       ->  airlin
'    (m>1) IC    ->                  gyroscopic     ->  gyroscop
'    (m>1) ABLE  ->                  adjustable     ->  adjust
'    (m>1) IBLE  ->                  defensible     ->  defens
'    (m>1) ANT   ->                  irritant       ->  irrit
'    (m>1) EMENT ->                  replacement    ->  replac
'    (m>1) MENT  ->                  adjustment     ->  adjust
'    (m>1) ENT   ->                  dependent      ->  depend
'    (m>1 and (*S or *T)) ION ->     adoption       ->  adopt
'    (m>1) OU    ->                  homologou      ->  homolog
'    (m>1) ISM   ->                  communism      ->  commun
'    (m>1) ATE   ->                  activate       ->  activ
'    (m>1) ITI   ->                  angulariti     ->  angular
'    (m>1) OUS   ->                  homologous     ->  homolog
'    (m>1) IVE   ->                  effective      ->  effect
'    (m>1) IZE   ->                  bowdlerize     ->  bowdler
'
'The suffixes are now removed. All that remains is a little tidying up.

'declaring local variables
Dim i As Byte
Dim temp As String
Dim step4(18) As String

'initializing contents of 2D array
step4(0) = "al"
step4(1) = "ance"
step4(2) = "ence"
step4(3) = "er"
step4(4) = "ic"
step4(5) = "able"
step4(6) = "ible"
step4(7) = "ant"
step4(8) = "ement"
step4(9) = "ment"
step4(10) = "ent"
step4(11) = "ion"
step4(12) = "ou"
step4(13) = "ism"
step4(14) = "ate"
step4(15) = "iti"
step4(16) = "ous"
step4(17) = "ive"
step4(18) = "ize"

'checking word
For i = 0 To 18 Step 1

    If porterEndsWith(str, step4(i)) Then
   
            temp = porterTrimEnd(str, Len(step4(i)))
           
            If porterCountm(temp) > 1 Then
           
                If porterEndsWith(str, "ion") Then
                    If porterEndsWith(temp, "s") Or porterEndsWith(temp, "t") Then
                        str = porterTrimEnd(str, Len(step4(i)))
                        str = porterAppendEnd(str, "")
                    End If
                Else
                    str = porterTrimEnd(str, Len(step4(i)))
                    str = porterAppendEnd(str, "")
                End If

            End If
           
            Exit For
           
    End If
   
Next i

'retuning the word
porterAlgorithmStep4 = str

End Function

Private Function porterAlgorithmStep5(str As String) As String

On Error Resume Next

'STEP 5a
'
'    (m>1) E     ->                  probate        ->  probat
'                                    rate           ->  rate
'    (m=1 and not *o) E ->           cease          ->  ceas
'
'STEP 5b
'
'    (m>1 and *d and *L) -> single letter
'                                    controll       ->  control
'                                    roll           ->  roll

'declaring local variables
Dim i As Byte
Dim temp As String


'Step5a
If porterEndsWith(str, "e") Then            'word ends with e
    temp = porterTrimEnd(str, 1)
    If porterCountm(temp) > 1 Then          'm>1
        str = porterTrimEnd(str, 1)
    ElseIf porterCountm(temp) = 1 Then      'm=1
        If Not porterEndsCVC(temp) Then     'not *o
            str = porterTrimEnd(str, 1)
        End If
    End If
End If


'--------------------------------------------------------------------------------------------------------
'
'Step5b
If porterCountm(str) > 1 Then
    If porterEndsDoubleConsonent(str) And porterEndsWith(str, "l") Then
        str = porterTrimEnd(str, 1)
    End If
End If

'retuning the word
porterAlgorithmStep5 = str

End Function

Private Function porterEndsWith(str As String, ends As String) As Boolean

On Error Resume Next

'declaring local variables
Dim length_str As Byte
Dim length_ends As Byte
Dim hold_ends As String

'finding the length of the string
length_str = Len(str)
length_ends = Len(ends)

'if length of str is greater than the length of length_ends, only then proceed..else return false
If length_ends >= length_str Then

    porterEndsWith = False
   
Else

    'extract characters from right of str
    hold_ends = Right(str, length_ends)
   
    'comparing to see whether hold_ends=ends
    If StrComp(hold_ends, ends) = 0 Then
        porterEndsWith = True
    Else
        porterEndsWith = False
    End If
   
End If

End Function

Private Function porterContains(str As String, present As String) As Boolean

On Error Resume Next

'checking whether strr contains present
If InStr(str, present) = 0 Then
    porterContains = False
Else
    porterContains = True
End If

End Function

Private Function porterContainsVowel(str As String) As Boolean

'checking word to see if vowels are present

Dim chars() As Byte
Dim i As Byte
Dim pattern As String

If Len(str) >= 0 Then

    'find out the CVC pattern
    pattern = returnCVCpattern(str)
   
    'check to see if the return pattern contains a vowel
    If InStr(pattern, "v") = 0 Then
        porterContainsVowel = False
    Else
        porterContainsVowel = True
    End If

Else
    porterContainsVowel = False
End If

End Function

Private Function porterEndsDoubleConsonent(str As String) As Boolean

On Error Resume Next

'checking whether word ends with a double consonant (e.g. -TT, -SS).

'declaring local variables
Dim holds_ends As String
Dim hold_third_last As String
Dim chars() As Byte

'first check whether the size of the word is >= 2
If Len(str) >= 2 Then

    'extract 2 characters from right of str
    holds_ends = Right(str, 2)
   
    'converting string to byte array
    chars = StrConv(holds_ends, vbFromUnicode)
   
    'checking if both the characters are same
    If chars(0) = chars(1) Then
   
        'check for double consonent
        If holds_ends = "aa" Or holds_ends = "ee" Or holds_ends = "ii" Or holds_ends = "oo" Or holds_ends = "uu" Then
           
            porterEndsDoubleConsonent = False
           
        Else
       
            'if the second last character is y, and there are atleast three letters in str
            If holds_ends = "yy" And Len(str) > 2 Then
           
                'extracting the third last character
                hold_third_last = Right(str, 3)
                hold_third_last = Left(str, 1)
               
                If Not (hold_third_last = "a" Or hold_third_last = "e" Or hold_third_last = "i" Or hold_third_last = "o" Or hold_third_last = "u") Then
                   
                    porterEndsDoubleConsonent = False
                   
                Else
               
                    porterEndsDoubleConsonent = True
                   
                End If
           
            Else
           
                porterEndsDoubleConsonent = True
               
            End If
           
        End If
   
    Else
   
        porterEndsDoubleConsonent = False
       
    End If
   
Else

    porterEndsDoubleConsonent = False
   
End If

End Function

Private Function porterEndsCVC(str As String) As Boolean

On Error Resume Next

'*o  - the stem ends cvc, where the second c is not W, X or Y (e.g. -WIL, -HOP).

'declaring local variables
Dim chars() As Byte
Dim const_vowel As String
Dim i As Byte
Dim pattern As String

'check to see if atleast 3 characters are present
If Len(str) >= 3 Then
   
    'converting string to byte array
    chars = StrConv(str, vbFromUnicode)
   
    'find out the CVC pattern
    pattern = returnCVCpattern(str)
   
    'we need to check only the last three characters
    pattern = Right(pattern, 3)
     
    'check to see if the letters in str match the sequence cvc
    If pattern = "cvc" Then
        If Not (Chr(chars(UBound(chars))) = "w" Or Chr(chars(UBound(chars))) = "x" Or Chr(chars(UBound(chars))) = "y") Then
            porterEndsCVC = True
        Else
            porterEndsCVC = False
        End If
    Else
        porterEndsCVC = False
    End If
   
Else

    porterEndsCVC = False

End If

End Function
Private Function porterTrimEnd(str As String, length As Byte) As String

On Error Resume Next

'returning the trimmed string
porterTrimEnd = Left(str, Len(str) - length)

End Function

Private Function porterAppendEnd(str As String, ends As String) As String

On Error Resume Next

'returning the appended string
porterAppendEnd = str + ends

End Function

Private Function porterCountm(str As String) As Byte

On Error Resume Next

'A \consonant\ in a word is a letter other than A, E, I, O or U, and other
'than Y preceded by a consonant. (The fact that the term `consonant' is
'defined to some extent in terms of itself does not make it ambiguous.) So in
'TOY the consonants are T and Y, and in SYZYGY they are S, Z and G. If a
'letter is not a consonant it is a \vowel\.

'declaring local variables
Dim chars() As Byte
Dim const_vowel As String
Dim i As Byte
Dim m As Byte
Dim flag As Boolean
Dim pattern As String

'initializing
const_vowel = ""
m = 0
flag = False


If Not Len(str) = 0 Then

    'find out the CVC pattern
    pattern = returnCVCpattern(str)
   
    'converting const_vowel to byte array
    chars = StrConv(pattern, vbFromUnicode)
   
    'counting the number of m's...
    For i = 0 To UBound(chars) Step 1
        If Chr(chars(i)) = "v" Or flag = True Then
            flag = True
            If Chr(chars(i)) = "c" Then
                m = m + 1
                flag = False
            End If
        End If
    Next i
   
End If

porterCountm = m

End Function

Private Function returnCVCpattern(str As String) As String

'local variables
Dim chars() As Byte
Dim const_vowel As String
Dim i As Byte

'converting string to byte array
chars = StrConv(str, vbFromUnicode)
   
'checking each character to see if it is a consonent or a vowel. also inputs the information in const_vowel
For i = 0 To UBound(chars) Step 1
   
    If Chr(chars(i)) = "a" Or Chr(chars(i)) = "e" Or Chr(chars(i)) = "i" Or Chr(chars(i)) = "o" Or Chr(chars(i)) = "u" Then
        const_vowel = const_vowel + "v"
    ElseIf Chr(chars(i)) = "y" Then
        'if y is not the first character, only then check the previous character
        If i > 0 Then
            'check to see if previous character is a consonent
            If Not (Chr(chars(i - 1)) = "a" Or Chr(chars(i - 1)) = "e" Or Chr(chars(i - 1)) = "i" Or Chr(chars(i - 1)) = "o" Or Chr(chars(i - 1)) = "u") Then
                const_vowel = const_vowel + "v"
            Else
                const_vowel = const_vowel + "c"
            End If
        Else
            const_vowel = const_vowel + "c"
        End If
    Else
        const_vowel = const_vowel + "c"
    End If
       
Next i
   
returnCVCpattern = const_vowel

End Function
User avatar
Geoff
i like jazz
 
Posts: 5678
Joined: Sat Dec 26, 2009 8:33 am

Postby banquo » Thu May 24, 2018 9:09 pm

thanks, geoff.

the split and porter alg are both pretty neat. honestly just when Draw even called it porter stemming, it helped my understanding of it and helped me find pre-made programs i could enter a large list into (if i needed it).

there's a bit to get through, so i may have follow-up questions later on. i appreciate how you found these bits to share, it's impressive to me that people are able to find a way through excel to do some of these things
User avatar
banquo
 
Posts: 3316
Joined: Tue Dec 15, 2009 8:28 pm

Postby Geoff » Tue May 29, 2018 6:05 pm

cool
User avatar
Geoff
i like jazz
 
Posts: 5678
Joined: Sat Dec 26, 2009 8:33 am

Postby Geoff » Fri Jun 01, 2018 6:37 pm

Jsn wrote:Im very appreciative. Sorry if that didn’t come across. If you reupload i will need to check with IT to cma before downloading but i think its worth it so yes please reupload


here it is http://s000.tinyupload.com/index.php?fi ... 7953624852
User avatar
Geoff
i like jazz
 
Posts: 5678
Joined: Sat Dec 26, 2009 8:33 am

PreviousNext

Return to Mamma Mia... Here We Go Again....

Who is online

Users browsing this forum: average deceiver, boatwave, captain, galactagogue, Google [Bot], Google Feedfetcher, Grey Poupon, hired goon, Hutch, jalapeño ranch, mego, MikeS, my piano, Plainsong, Repo, ruse, shirts optional, trampoline, waldojeffers1