The '''SMART (System for the Mechanical Analysis and Retrieval of Text) Information Retrieval System''' is an [[information retrieval]] system developed at [[Cornell University]] in the 1960s. Many important concepts in information retrieval were developed as part of research on the [ftp://ftp.cs.cornell.edu/pub/smart/ SMART] system, including the [[vector space model]], [[relevance feedback]], and [[Rocchio Classification]].

[[Gerard Salton]] led the group that developed SMART. Other contributors included [[Mike Lesk]].

The SMART system also provides a set of corpora, queries and reference rankings, taken from different subjects, notably:
* [ftp://ftp.cs.cornell.edu/pub/smart/adi ADI]: publications from information science reviews
* [ftp://ftp.cs.cornell.edu/pub/smart/cacm CACM]: computer science
* [ftp://ftp.cs.cornell.edu/pub/smart/cran/ Cranfield collection]: publications from aeronautics reviews
* [ftp://ftp.cs.cornell.edu/pub/smart/cisi CISI]: library science
* [ftp://ftp.cs.cornell.edu/pub/smart/med/ Medlars collection]: publications from medical reviews
* [ftp://ftp.cs.cornell.edu/pub/smart/time/ Time magazine collection]: archives of the general-interest magazine [[Time (magazine)|''Time'']] from 1963
The legacy of the SMART system includes the so-called SMART notation, a mnemonic scheme for denoting [[tf-idf]] weighting variants in the vector space model. The mnemonic for a combination of weights takes the form ddd.qqq, where the first three letters represent the term weighting of the document vector and the second three letters represent the term weighting of the query vector. Within each triple, the letters denote, in order, the term frequency component, the document frequency component, and the normalization, matching the columns of the table below. The letter representation for a term, <math> t </math>, and document, <math> d </math>, is as follows:<ref>{{Citation
| last = Manning
| first = Christopher D.
| last2 = Raghavan
| first2 = Prabhakar
| last3 = Schütze
| first3 = Hinrich
| title = Introduction to Information Retrieval
| publisher = [[Cambridge University Press]]
| year = 2008
| chapter = Document and query weighting schemes
| chapter-url = http://nlp.stanford.edu/IR-book/html/htmledition/document-and-query-weighting-schemes-1.html
| url = http://nlp.stanford.edu/IR-book/
}}</ref>
{|border="1" cellpadding="5" cellspacing="0" align="center"
|-
! scope="col" | Term frequency
! scope="col" | Document frequency
! scope="col" | Normalization
|-
|n (natural): <math>\text{tf}_{t,d}</math>
|n (no): 1
|n (none): 1
|-
|l (logarithm): <math>1+\log(\text{tf}_{t,d})</math>
|t (idf): <math>\log\tfrac{N}{df_{t}}</math>
|c (cosine): <math>\tfrac{1}{\sqrt{w_1^2 + w_2^2 + \dots + w_M^2}}</math>
|-
|a (augmented): <math>0.5 + \tfrac{0.5 \times \text{tf}_{t,d}}{\max_t(\text{tf}_{t,d})}</math>
|p (prob idf): <math>\max\left(0, \log\tfrac{N-df_{t}}{df_{t}}\right)</math>
|b (byte size): <math>1/CharLength^\alpha, \alpha < 1</math>
|-
|b (boolean): <math>\begin{cases} 1, & \text{if tf}_{t,d} > 0 \\ 0, & \text{otherwise} \end{cases}</math>
|
|
|-
|L (log average): <math>\tfrac{1+\log(\text{tf}_{t,d})}{1+\log(\text{ave}_{t \in d}(\text{tf}_{t,d}))}</math>
|
|
|}
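where <math>\text{tf}_{t,d}</math> is the term frequency of term <math>t</math> in document <math>d</math>, <math>N</math> is the number of documents in the collection, and <math>df_{t}</math> is the number of documents that contain <math>t</math>.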
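As an illustration of how a ddd.qqq mnemonic can be read, the following Python sketch computes a similarity score for a scheme such as lnc.ltc. It is not code from the SMART system: the function names (<code>tf_component</code>, <code>df_component</code>, <code>weight_vector</code>, <code>score</code>) are hypothetical, base-10 logarithms are assumed because the table does not fix a base, the byte size normalization is omitted, and the counts in the usage example are made-up values rather than data from any of the SMART test collections.

```python
import math


def tf_component(code, tf, counts):
    """Term frequency component for a term that occurs tf > 0 times.

    counts maps every term of the same document (or query) to its raw count.
    """
    if code == "n":  # natural
        return float(tf)
    if code == "l":  # logarithm (base 10 is an assumption)
        return 1 + math.log10(tf)
    if code == "a":  # augmented
        return 0.5 + 0.5 * tf / max(counts.values())
    if code == "b":  # boolean
        return 1.0
    if code == "L":  # log average
        avg = sum(counts.values()) / len(counts)
        return (1 + math.log10(tf)) / (1 + math.log10(avg))
    raise ValueError(f"unsupported term frequency code: {code}")


def df_component(code, df, n_docs):
    """Document frequency component for a term found in df of n_docs documents."""
    if code == "n":  # no document frequency weighting
        return 1.0
    if code == "t":  # idf
        return math.log10(n_docs / df)
    if code == "p":  # prob idf, clipped at zero
        ratio = (n_docs - df) / df
        return math.log10(ratio) if ratio > 1 else 0.0
    raise ValueError(f"unsupported document frequency code: {code}")


def weight_vector(triple, counts, df, n_docs):
    """Weight a bag of words with a three-letter triple such as 'ltc'."""
    tf_code, df_code, norm_code = triple
    weights = {
        term: tf_component(tf_code, tf, counts)
        * df_component(df_code, df[term], n_docs)
        for term, tf in counts.items()
        if tf > 0
    }
    if norm_code == "c":  # cosine normalization
        length = math.sqrt(sum(w * w for w in weights.values()))
        if length > 0:
            weights = {t: w / length for t, w in weights.items()}
    elif norm_code != "n":  # 'b' (byte size) is not covered by this sketch
        raise ValueError(f"unsupported normalization code: {norm_code}")
    return weights


def score(notation, doc_counts, query_counts, df, n_docs):
    """Dot product of document and query vectors under a ddd.qqq scheme."""
    doc_triple, query_triple = notation.split(".")
    d = weight_vector(doc_triple, doc_counts, df, n_docs)
    q = weight_vector(query_triple, query_counts, df, n_docs)
    return sum(w * q[t] for t, w in d.items() if t in q)


# Made-up collection statistics and counts, for illustration only.
n_docs = 1_000_000
df = {"auto": 5000, "best": 50000, "car": 10000, "insurance": 1000}
doc = {"car": 1, "insurance": 2, "auto": 1}
query = {"best": 1, "car": 1, "insurance": 1}

print(round(score("lnc.ltc", doc, query, df, n_docs), 2))  # about 0.80
```

Under lnc.ltc, the document side uses logarithmic term frequency, no document frequency weighting and cosine normalization, while the query side adds idf weighting. Applying idf on the query side only means that each document vector can be computed without collection-wide statistics.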
== References ==
{{Reflist}}

== External links ==
* [ftp://ftp.cs.cornell.edu/pub/smart/ Software and test collections] (FTP at [[Cornell University]])
* [http://tesla.tcnj.edu/SMART/index.php Interactive SMART tutorial]
* [http://www.tcnj.edu/~mmmartin/EThul/SMART/ SMART case study - Eric Thul]
* [http://www.tcnj.edu/~mmmartin/CSC485IMME321/Papers/SMART/SmartCourse.html SMART tutorial for beginners - Hans Paijimas]

[[Category:Discontinued software]]
[[Category:Search engine software]]

{{compu-soft-stub}}