The "Fast Lane" Answer
Slow down a second. Before we go any further, we need to determine two things. First, do you mean "open source" or "free"? Second, when you say "address scrubber," are you referring to parsing the address? Or do you mean standardizing it? The distinctions are important, because they change the answers you need.
You know what? We'll just answer everything.
Open Source vs. Free: Open source means that the source code is visible to the public. Free means there's no payment required to use it.
Parsing vs. Standardization: Parsing an address means breaking it into smaller chunks and labeling those chunks for easier processing. Standardizing means reformatting an address so that it looks the way the USPS wants it to look. The difference is a fine line, though, and most tools that do one also do the other. Some, however, go a step further and offer address validation, a process that incorporates the other two.
Below is a better explanation of the differences, and how knowing what to ask for helps you find what you want.
The "Scenic Route" Answer
Open Source vs. Free
Open Source
Open source means that the source code of a program is visible to the public, and that it can be reviewed and edited by anyone, or at least anyone willing to adhere to the terms of use. (We're developers, not barbarians.) It means that the creators of the code are willing to hear your criticism and consider your input. It's the whole "sharing is caring" philosophy applied to computer systems.
The open dialog that exists between developers and users in an open-source format is unique: by opening the source code to the public, developers are offering users the chance to participate in development. They invite everyone to be the developers of the program, to participate in the creative process, and to help make the code (and by association the program as a whole) better.
Open source, however, does not automatically mean "free." Linux, for example, is free to use in most forms, but there are other open-source offerings that are not.
Free
A word that here means: without cost, requires no fee to participate/use, charges you nothing, etc. Free is when you receive a gift for your birthday, or find a five dollar bill on the ground. But free doesn't always offer freedom. Sometimes it resembles a borrowing system: "Yeah, you can use my stuff, but keep it the way I had it." Adobe Acrobat Reader is free to use. But it's also proprietary software (i.e., not open source), meaning you can't pop the hood and tinker with the engine. Adobe keeps that locked down tight.
So keep in mind what you're looking for. Are you wanting to pay nothing? Or do you want a big sandbox, open to any sort of creation or destruction you wish to bestow upon it?
Address Parsing vs. Address Standardization
Parsing
Parsing is complicated. It tries to determine the intent of content. It has many applications, not just regarding the processing of an address. At its core it's an attempt to have a computer decipher the meaning of human communication. Normally computers work best when we speak to them in their language; parsing is a computer trying to speak ours.
Time to get down to brass tacks: when a program parses input, it tries to break it into pieces and categorize it. For instance, if someone were to take an address (Spider-Man's house, maybe: 20 Ingram St. Queens, New York) and parse it, they would run it through a program that has to decide for itself what portion of the address is the city, which part is the street, and which part is the house number. Then it would label it: "20" is the house number, "Ingram" is the street, "Queens" is the city, and so forth.
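To make that labeling step concrete, here is a minimal sketch of a regex-based parser. The pattern, the group names, and the `parse_address` helper are all made up for illustration; it only handles addresses shaped exactly like the example, which is a long way from what a real parser has to cope with.

```python
import re

# Hypothetical pattern for illustration only: house number, street name,
# street suffix, city, state. Real addresses are far messier than this.
ADDRESS_PATTERN = re.compile(
    r"^(?P<house_number>\d+)\s+"
    r"(?P<street>[A-Za-z ]+?)\s+"
    r"(?P<suffix>St|Ave|Blvd|Rd)\.?\s+"
    r"(?P<city>[A-Za-z ]+),\s*"
    r"(?P<state>[A-Za-z ]+)$"
)


def parse_address(text):
    """Return a dict of labeled components, or None if the pattern fails."""
    match = ADDRESS_PATTERN.match(text.strip())
    return match.groupdict() if match else None


parsed = parse_address("20 Ingram St. Queens, New York")
# parsed["house_number"] is "20", parsed["street"] is "Ingram",
# parsed["city"] is "Queens", parsed["state"] is "New York"
```

Anything that doesn't fit the pattern simply comes back as `None`, which hints at how brittle this approach gets once real-world input shows up.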
That would be a pretty simple process (any parsing would be straightforward, really), if we as humans weren't so fond of ambiguity and repetition.
Take the above example again. As a human, it's pretty easy to assume that the city is "Queens, New York." Queens is pretty famous, and there's no "Saint Queens," New York. So it's an easy call.
But not everywhere is like that.
Take Helena, California. There are two of them: Helena and St. Helena. So to illustrate the problem here, let's relocate Spidey to the west coast. That way, when the computer goes to parse 20 Ingram St. Helena, California, it has no way of knowing if it's Ingram St. in Helena, California, or if it's Ingram whatever in St. Helena, California. That leaves the computer to try to determine intent, and hope for the best. This is a problem inherent in using regular expressions (aka: regex) for parsing addresses.
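The ambiguity is easy to reproduce. The sketch below tries every way of splitting the middle of the address into a street and a city, and keeps the splits whose city appears in a lookup table. The `KNOWN_CITIES` table and the `candidate_parses` helper are hypothetical, invented only to show that both readings survive.

```python
# Hypothetical lookup table; a real system would use a much larger list.
KNOWN_CITIES = {"Helena", "St. Helena"}


def candidate_parses(text):
    """Return every (street, city) split that ends in a known city name."""
    line, _, state = text.partition(",")
    house_number, _, rest = line.strip().partition(" ")
    words = rest.split()
    candidates = []
    # Try each split point: words before it are the street, after it the city.
    for i in range(1, len(words)):
        street = " ".join(words[:i])
        city = " ".join(words[i:])
        if city in KNOWN_CITIES:
            candidates.append(
                {
                    "house_number": house_number,
                    "street": street,
                    "city": city,
                    "state": state.strip(),
                }
            )
    return candidates


parses = candidate_parses("20 Ingram St. Helena, California")
# Two candidates come back: street "Ingram" in "St. Helena",
# and street "Ingram St." in "Helena".
```

Nothing in the text itself says which candidate the writer meant, so the program has to guess or ask something that knows more.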
Now as hard and as complicated as that is, it still has value. Going on the presumption that the parser accurately estimated the intent of the entry, it can then run it through a secondary process, as done in industries like address verification/validation. But more on that later.
So if you're looking to have your address broken down and identified, you're looking for parsing. It's a complicated method, but it has its payoffs.
Standardization
Standardization is when an address is reformatted to match the standards set by the USPS so it can be processed by them. When run through a standardizing program, addresses will be "cleaned up": commas will be added between city and state if they're missing, words like "street" and "boulevard" will be abbreviated properly, words will be capitalized in keeping with the USPS system, and so on. In short, it's the process of making an address look like it was written by the postman himself.
Standardizing doesn't break anything down into components. Nothing is labeled or categorized by the end of the process. The word "street" is recognized because it's the word "street," not because of its position in the syntax of the address.
So if you're looking to shape up that address list so it looks more professional, you're looking for standardization.
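As a rough illustration of that word-level cleanup, here is a small sketch that uppercases an address, collapses stray whitespace, and abbreviates a few suffix words. The `SUFFIXES` table is a tiny, hand-picked subset and the `standardize` helper is hypothetical; it is not the full USPS rule set.

```python
import re

# A small, illustrative subset of suffix abbreviations, not the full USPS list.
SUFFIXES = {"STREET": "ST", "BOULEVARD": "BLVD", "AVENUE": "AVE"}


def standardize(text):
    """Uppercase the address and abbreviate known suffix words."""
    cleaned = re.sub(r"\s+", " ", text.strip()).upper()
    # Replace each known word wherever it appears, without labeling anything.
    for word, abbreviation in SUFFIXES.items():
        cleaned = re.sub(rf"\b{word}\b", abbreviation, cleaned)
    return cleaned


result = standardize("20  ingram street, Queens, new york")
# result is "20 INGRAM ST, QUEENS, NEW YORK"
```

Notice that the function never decides which part is the street or the city. It matches the word "STREET" wherever it finds it, which is exactly why standardizing alone can't tell you what each piece of the address is.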
Validation
The ultimate form of "address scrubbing" is address validation (also known as address verification). Validation runs an address through a string of processes, and does at least three things: it parses the address, it standardizes the address, and it validates it. The validation part means that an address is compared against an authoritative database and checked to see if it is real.
An address currently receiving mail from the postal system will be returned by our system as valid, and is a real address. Addresses that, for whatever reason, do not currently receive mail are marked as invalid. That doesn't necessarily mean that the address is fictional; it just means that it's not currently in the system. An invalid address may or may not be real; a valid address is real for sure.
An address has to be standardized before being compared to the system, and the system has the addresses already broken down into parts. So, as part of the validation process, the address will be standardized, then compared, then returned with components accurately labeled.
Some validation providers do parsing in-house, so that even if an address fails to validate they can return its components to you and indicate which part failed the test. Not every provider does this, and even those who do can't guarantee 100% accuracy, due to the nature of parsing. It's simply not a foolproof system. Whether or not the parsing is done in-house, they can still standardize the address.
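The standardize, compare, return sequence can be sketched in a few lines. Everything here is an assumption made for illustration: the `KNOWN_DELIVERY_POINTS` set is a hard-coded stand-in for an authoritative database, its single entry is sample data, and the `validate` helper assumes the address has already been split into components.

```python
# Hypothetical stand-in for an authoritative database; real validation
# relies on licensed postal data, not a hard-coded set.
KNOWN_DELIVERY_POINTS = {
    ("20", "INGRAM ST", "QUEENS", "NY"),
}

LABELS = ("house_number", "street", "city", "state")


def validate(house_number, street, city, state):
    """Standardize the components, then look them up in the reference set."""
    # Standardize: trim whitespace and uppercase each component.
    key = tuple(
        part.strip().upper() for part in (house_number, street, city, state)
    )
    # Compare: an address is valid only if it is in the reference data.
    is_valid = key in KNOWN_DELIVERY_POINTS
    # Return: hand back the verdict along with the labeled components.
    return {"valid": is_valid, "components": dict(zip(LABELS, key))}


hit = validate("20", "Ingram St", "Queens", "ny")
miss = validate("21", "Ingram St", "Queens", "NY")
# hit["valid"] is True; miss["valid"] is False
```

A `False` here only means the address isn't in the reference set, which mirrors the point above: invalid doesn't prove the address is fictional, just that the system doesn't know about it.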
Open Source Address Validation
Validation requires access to an authoritative database, and the owners of that data (at least in the US) don't want their stuff strewn across the internet. So for US addresses, there is no truly open source solution for validation.
Conclusion
We hope this article has done its job and clarified some of the details that might be obfuscating your search for whichever of the above services you need. Along those lines, we would be remiss if we did not mention that we provide a significant amount of free address validation (which, as mentioned, includes the parsing/standardizing part), and the stuff you pay for is so reliable and lightning fast that it merits the price.
The information delivered in this article was probably overkill, but a wise woman once told us that "nuking it from orbit is the only way to be sure."
We're pretty sure we're not taking that advice out of context.
In any case, good luck and happy hunting.
FAQs
Is USPS address validation API free?
Customers who have registered for a USPS Web Tools® account and agreed to the terms and conditions of use can access USPS APIs for free.
What is the best way to validate addresses?
The first is to use a USPS® address verification tool. These tools can verify both US-based addresses as well as international addresses in batches. The second method is to use an address validation API. This technical tool connects to your website, application, or other technology to validate addresses in real time.
Why won't my address validate?
Sometimes, an address will not validate because the address is marked as "vacant" by the USPS. Additionally, a new address, an unregistered address, or one located within a postal code primarily serviced by PO boxes, would all fail to validate.
What is SmartyStreets used for?
SmartyStreets is headquartered in Provo, Utah and provides enterprise-grade address validation, standardization, and geocoding services in 240+ countries and territories. SmartyStreets processes billions of addresses every month. The company does this through easy-to-use website tools, and fully-documented APIs.
Does Google have an address validation API?
Yes. Google has announced the general availability release of Address Validation, an API that can help improve user experiences by removing friction at account sign-up or checkout, and save time and money for your business by helping reduce the impact of invalid addresses on your operations.
How can I verify an address for free?
If you only need to use an address checker to verify USPS addresses occasionally, the USPS Zip Code Lookup tool works to validate US postal addresses one at a time. It's easy enough to use, free, and requires no setup. The tool also returns addresses in USPS standardized format.
What is the most accurate reverse address lookup?
- Intelius – Editor's Choice – Best Reverse Address Lookup Service.
- Peoplefinders – Runner Up.
- Spokeo – Honorable Mention.
- Instant Checkmate – Best Affordable Trial.
- Truthfinder – Best Advanced Search and Filters.
The United States Postal Service maintains a database of 160 million mailing addresses in the United States that is updated monthly. The database contains business, residential, and post office box delivery points.
How does USPS verify an address?
The USPS offers address verification directly on their website. Addresses are processed one at a time by typing the address into the provided fields. The USPS also licenses their services to third-party companies that provide the CASS certification in bulk.
What is address validation using USPS API?
The USPS address validation API gives software developers access to data on over 160 million US postal delivery points as well as ZIP Code and city information. An Address Validation API or verification API is an application programming interface (API) that verifies postal information automatically.
How do I get USPS to validate my address?
Phone: National Customer Support Center (NCSC) at (800) 238-3150.
What is US postal address verification API?
The Address Validation API is a service that accepts an address, identifies the address components, and validates them. It also standardizes the address for mailing and finds the best known lat/long location for it.
How do I get a USPS API?
Get access to USPS APIs when you register for a Web Tools user account. Once registered, you will automatically receive an email containing your assigned Web Tools User ID. After you receive this email, you can use the following APIs: Price Calculators.