Lanczos approximation: Difference between revisions

From formulasearchengine
Jump to navigation Jump to search
The reflection formula was implemented in the python example incorrectly. Division was changed to multiplication.
en>Yobot
m Introduction: WP:CHECKWIKI error fixes using AWB (10093)
 
Line 1: Line 1:
The '''hash join''' is an example of a [[Join (SQL)|join algorithm]] and is used in the implementation of a [[relational database|relational]] [[database management system]].


The task of a join algorithm is to find, for each distinct value of the join attribute, the set of [[Tuple#Relational model|tuples]] in each relation which have that value.


Hash joins require an [[equijoin]] predicate (a predicate comparing values from one table with values from the other table using the equals operator '=').
I am Emmanuel though I don't truly like being known as like that. Invoicing is her occupation but she's already utilized for an additional one. [http://Www.Wisconsin.com/ Wisconsin] is where his home is. What she enjoys performing is chess and she's been performing it for quite a while. He's been working on his web site for some time now. Check it out here: http://bioingenios.ira.[http://mondediplo.com/spip.php?page=recherche&recherche=cinvestav cinvestav].mx:81/tallerBS2012/index.php/You_Can_Have_Your_Cake_And_Nya_Internet_Casinon_Too<br><br>Here is my blog :: [http://bioingenios.ira.cinvestav.mx:81/tallerBS2012/index.php/You_Can_Have_Your_Cake_And_Nya_Internet_Casinon_Too nya internet svenska casino på nätet]
 
== Classic hash join ==
The classic hash join algorithm for an [[Join_(SQL)#Inner_join|inner join]] of two relations proceeds as follows:
* First prepare a [[hash table]] of the smaller relation. The [[hash table]] entries consist of the join attribute and its row. Because the hash table is accessed by applying a [[hash function]] to the join attribute, it will be much quicker to find a given join attribute's rows by using this table than by scanning the original relation.
* Once the [[hash table]] is built, scan the larger relation and find the relevant rows from the smaller relation by looking in the [[hash table]].
The first phase is usually called the '''"build" phase''', while the second is called the '''"probe" phase'''. Similarly, the join relation on which the hash table is built is called the "build" input, whereas the other input is called the "probe" input.
 
This algorithm is simple, but it requires that the smaller join relation fits into memory, which is sometimes not the case. A simple approach to handling this situation proceeds as follows:
 
# For each tuple <math>r</math> in the build input <math>R</math>
## Add <math>r</math> to the in-memory hash table
## If the size of the hash table equals the maximum in-memory size:
### Scan the probe input <math>S</math>, and add matching join tuples to the output relation
### Reset the hash table
# Do a final scan of the probe input <math>S</math> and add the resulting join tuples to the output relation
 
This is essentially the same as the [[block nested loop]] join algorithm. This algorithm scans <math>S</math> more times than necessary.
 
== Grace hash join ==
A better approach is known as the "grace hash join", after the GRACE database machine for which it was first implemented.
 
This algorithm avoids rescanning the entire <math>S</math> relation by first partitioning both <math>R</math> and <math>S</math> via a hash function, and writing these partitions out to disk. The algorithm then loads pairs of partitions into memory, builds a hash table for the smaller partitioned relation, and probes the other relation for matches with the current hash table. Because the partitions were formed by hashing on the join key, it must be the case that any join output tuples must belong to the same partition.
 
It is possible that one or more of the partitions still does not fit into the available memory, in which case the algorithm is recursively applied: an additional orthogonal hash function is chosen to hash the large partition into sub-partitions, which are then processed as before. Since this is expensive, the algorithm tries to reduce the chance that it will occur by forming as many partitions as possible during the initial partitioning phase.
 
== Hybrid hash join ==
The hybrid hash join algorithm<ref>{{cite journal <!-- This is a conference proceedings, but it's also a journal. -->
  | last=DeWitt
  | first=D.J.
  | coauthors=Katz, R.; Olken, F.; Shapiro, L.; Stonebraker, M.; Wood, D.
  | title=Implementation techniques for main memory database systems
  | volume=14
  | issue=4
  | pages=1–8
  | journal=Proc. ACM SIGMOD Conf
  | doi=10.1145/971697.602261
  |date=June 1984 }}</ref> is a refinement of the grace hash join which takes advantage of more available memory. During the partitioning phase, the hybrid hash join uses the available memory for two purposes:
# To hold the current output buffer page for each of the <math>k</math> partitions
# To hold an entire partition in-memory, known as "partition 0"
Because partition 0 is never written to or read from disk, the hybrid hash join typically performs fewer I/O operations than the grace hash join. Note that this algorithm is memory-sensitive, because there are two competing demands for memory (the hash table for partition 0, and the output buffers for the remaining partitions). Choosing too large a hash table might cause the algorithm to recurse because one of the non-zero partitions is too large to fit into memory.
 
== Hash anti-join ==
Hash joins can also be evaluated for an anti-join predicate (a predicate selecting values from one table when no related values are found in the other). Depending on the sizes of the tables, different algorithms can be applied:
 
=== Hash left anti-join ===
 
* Prepare a [[hash table]] for the '''NOT IN''' side of the join.
* Scan the other table, selecting any rows where the join attribute hashes to an empty entry in the hash table.
 
This is more efficient when the '''NOT IN''' table is smaller than the '''FROM''' table
 
=== Hash right anti-join ===
 
* Prepare a hash table for the '''FROM''' side of the join.
* Scan the '''NOT IN''' table, removing the corresponding records from the hash table on each hash hit
* Return everything that left in the hash table
 
This is more efficient when the '''NOT IN''' table is larger than the '''FROM''' table
 
== Hash semi-join ==
 
Hash semi-join is used to return the records found in the other table. Unlike plain join, it returns each matching record from the leading table only once, not regarding how many matches are there in the '''IN''' table.
 
As with the anti-join, semi-join can also be left and right:
 
=== Hash left semi-join ===
 
* Prepare a hash table for the '''IN''' side of the join.
* Scan the other table, returning any rows that produce a hash hit.
 
The records are returned right after they produced a hit. The actual records from the hash table are ignored.
 
This is more efficient when the '''IN''' table is smaller than the '''FROM''' table
 
=== Hash right semi-join ===
 
* Prepare a hash table for the '''FROM''' side of the join.
* Scan the '''IN''' table, returning the corresponding records from the hash table and removing them
 
With this algorithm, each record from the hash table (that is, '''FROM''' table) can only be returned once, since it's removed after being returned.
 
This is more efficient when the '''IN''' table is larger than the '''FROM''' table
 
==References==
<references />
 
==External links==
* {{Cite journal |title=An Adaptive Hash Join Algorithm for Multiuser Environments
|url=http://www.vldb.org/conf/1990/P186.PDF |format=PDF|author1=Hansjörg Zeller |author2=Jim Gray |author2-link=Jim Gray (computer scientist) |journal=Proceedings of the 16th VLDB conference |place=Brisbane |year=1990 |pages=186–197 |accessdate=2008-09-21 |postscript=<!--None-->}} {{Dead link|date=October 2010|bot=H3llBot}}
 
==See also==
[[Symmetric Hash Join]]
 
{{DEFAULTSORT:Hash Join}}
[[Category:Hashing]]
[[Category:Join algorithms]]

Latest revision as of 15:22, 5 May 2014


I am Emmanuel though I don't truly like being known as like that. Invoicing is her occupation but she's already utilized for an additional one. Wisconsin is where his home is. What she enjoys performing is chess and she's been performing it for quite a while. He's been working on his web site for some time now. Check it out here: http://bioingenios.ira.cinvestav.mx:81/tallerBS2012/index.php/You_Can_Have_Your_Cake_And_Nya_Internet_Casinon_Too

Here is my blog :: nya internet svenska casino på nätet