These "Falsehoods programmers believe about XYZ" sites are interesting and all, but the practical takeaway always seems to be:
"In order to handle all of these clever edge cases, your [Time|Date|Name|Address|Gender] handling code should be 2,000 lines long, not 2 lines long."
Which is not very helpful if you're just trying to get your product out. Sure, you could take another week to make sure you handle that hamlet in Macedonia whose name isn't a text string, but then again, why not just slap in "address" "city" "state" and "zip" fields and get on with the rest of your actual product? The naive solution covers pretty much all of the USA market. Hell, call those last two "state/province" and "zip/postcode" and you've got most of the world handled. Fix the edge cases in V2, if you even have to.
I think you're getting the wrong takeaway from these articles.
The message is that the real world is messy so don't spend 2000 lines of code trying to validate it to a strict (simplistic) model.
For example my bank's website couldn't deal with the fact that my street address was (representative example) "5/3 Hawthorn Street" because the '/' was an illegal character (as was -,: and a whole bunch of other separators I tried) - but that's how it appears in the Post Office's database. Some programmer has spent effort to implement the code to reject my legal (and incredibly common) style of address. It would have been far less effort not to have done that.
By contrast the DVLA website just gives you a free input text area, perfect. Minimal effort for them, zero hassle for me.
Allowing free-form addresses and not validating them can cause you a lot of nasty problems - at Reed Elsevier we had some tricky bugs caused by user-entered addresses crashing the CMS.
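The two positions above (take the address as the user wrote it, but don't let it break things downstream) aren't really in conflict. A rough sketch of what I mean, in Python: nothing is rejected, the input is only normalised. The `clean_address` name and the 500-character cap are made up for illustration; this isn't what the bank, the DVLA or Reed Elsevier actually run.

```python
import unicodedata

MAX_LEN = 500  # assumed cap; pick whatever your storage layer can take


def clean_address(raw: str) -> str:
    """Normalise a free-form address without rejecting odd-looking ones."""
    # NFC normalisation so visually identical strings compare equal.
    text = unicodedata.normalize("NFC", raw)
    # Unify line endings.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Turn control characters (NUL, tab, ...) into spaces, keeping newlines.
    text = "".join(
        ch if ch == "\n" or unicodedata.category(ch) != "Cc" else " "
        for ch in text
    )
    # Collapse runs of whitespace on each line and drop empty lines.
    lines = [" ".join(line.split()) for line in text.split("\n")]
    text = "\n".join(line for line in lines if line)
    # Hard cap on length; separators such as "/" are left untouched.
    return text[:MAX_LEN]


print(clean_address("5/3 Hawthorn Street"))
```

Anything that renders the result still has to escape it for its own output format (HTML, SQL and so on); that part belongs in the rendering layer, not in address validation.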
4) Every nation has some defined set of borders. (The Yemen-Saudi Arabia border was only recently agreed, and was previously left deliberately ambiguous.)
5) Countries do not overlap. (Counterexample: France and Spain agree they both own Pheasant Island in the Pyrenees. The administration cycles every six months.)
6) Borders are at fixed coordinates. (Except for some borders defined by a river, which change when the river shifts its course. Some subnational borders are certainly like this.)
7) A nation's borders can be drawn with a single stroke of the pen. (Counterexample: South Africa takes two, one for the outside and one for its enclave Lesotho.)
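For point 7, the usual way to cope is to model a territory as a list of polygons, each with an outer ring and optional holes. A minimal sketch, assuming flat x/y coordinates and a toy square rather than real border data (the `Polygon` and `Territory` names are mine, not from any GIS library):

```python
from dataclasses import dataclass, field

Point = tuple[float, float]
Ring = list[Point]


def _in_ring(p: Point, ring: Ring) -> bool:
    """Ray-casting test: is p inside the closed ring?"""
    x, y = p
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        # Does a horizontal ray going right from p cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


@dataclass
class Polygon:
    exterior: Ring
    holes: list[Ring] = field(default_factory=list)  # enclaves cut out of it

    def contains(self, p: Point) -> bool:
        return _in_ring(p, self.exterior) and not any(
            _in_ring(p, hole) for hole in self.holes
        )


@dataclass
class Territory:
    name: str
    parts: list[Polygon]  # more than one part covers exclaves and islands

    def contains(self, p: Point) -> bool:
        return any(part.contains(p) for part in self.parts)


# Toy data: a 10x10 square with a 2x2 hole in the middle.
outer = [(0, 0), (10, 0), (10, 10), (0, 10)]
hole = [(4, 4), (6, 4), (6, 6), (4, 6)]
toy = Territory("Toyland", [Polygon(outer, [hole])])
print(toy.contains((1, 1)), toy.contains((5, 5)))
```

Nothing here stops two territories from both containing the same point, which is what point 5 needs. It does not handle point 6 at all: the rings are fixed, so a border that follows a river would need a date attached to each version.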
Postal addresses correspond to political boundaries.
(Postal addresses are closer to delivery routes; a ZIP code can cover multiple, distinct, non-adjoining towns in a non-compact area; the 'town' in the postal address can be different from the town in which the address is located; ZIP codes can change if the delivery routes are changed.)
Not surprised about the countries, considering there is not even one definitive way of naming the continents[1]. In fact, the division I know for America is not mentioned in the Wikipedia page.
Don't know about probably_wrong, but I learned that the Americas are divided into South, Central and North, with Mexico usually in Central America. Got it in geography class in Brazil.
I've never seen this definition, but I've seen two different ones that were often confused: the division into North, Central and South America, where Mexico belongs to North America, and the division into Anglo-Saxon and Latin America, where Mexico belongs to the latter.
Are you sure you're not mixing them up? I'm asking because I've seen people mix them up so frequently that in some cases you can even find notices in geography books making this clear.
Central America is a subset of North America, but it's not defined particularly consistently. Mexico belongs to North America, and it may also be part of Central America, depending on which definition is used (the major difference between the most common definitions of Central America is whether or not Mexico is included).
I was referring to Carioca's view, which I also learned in school.
There's another one, too, over which I've had internet arguments before. In that one, the USA and Canada are "America", while the rest of the continent is "South America".
The given reason was: it is not called "United States of North America", it's just "America". The rest of the countries are south of there, so all that region is "South America". I don't know how widespread this view is, though (nor do I know where Mexico fits into it).
RE 2: the ITU/ISO has a definitive list of two- and three-letter codes to identify countries. It is certainly the source I used when building the country standing data tables for a BT billing system back in the day.
That is an ISO standard definitive list of places. Some of them are generally not considered to be countries (e.g. SJ Svalbard and Jan Mayen, VG British Virgin Islands), and some claim to be countries but are not recognised by some other countries (e.g. TW Taiwan, PS Palestine and EH Western Sahara).
Apologies for bringing in hotly disputed territories of the world, but that's basically what this thread is about. I'm not making a political argument for or against any of those places, just saying that there is disagreement.
That said, I would (and do) use ISO two letter codes to identify territories in my code.
In this application we had to use both the two- and three-letter alpha codes plus the numeric ones (for telex/fax billing).
I also recall receiving a sad email from my colleagues in Yugoslavia describing how the country was splitting up and how we were to adjust our routing tables.
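For anyone who hasn't dealt with the three code forms, here's a small sketch of a lookup table keyed by any of them. The handful of rows are ISO 3166-1 entries for places mentioned in this thread; the `TerritoryCode` and `lookup` names are invented for the example, the display names are just labels, and a real system would load the full list rather than hard-code it.

```python
from typing import NamedTuple, Optional


class TerritoryCode(NamedTuple):
    alpha2: str
    alpha3: str
    numeric: str  # kept as a string so leading zeros survive ("092")
    name: str     # display label only; the code is the identifier


_ROWS = [
    TerritoryCode("FR", "FRA", "250", "France"),
    TerritoryCode("RE", "REU", "638", "Réunion"),
    TerritoryCode("SJ", "SJM", "744", "Svalbard and Jan Mayen"),
    TerritoryCode("VG", "VGB", "092", "British Virgin Islands"),
    TerritoryCode("TW", "TWN", "158", "Taiwan"),
]

# One index covering all three code forms; they never collide because
# they differ in length or in being digits.
_INDEX = {}
for row in _ROWS:
    for key in (row.alpha2, row.alpha3, row.numeric):
        _INDEX[key] = row


def lookup(code: str) -> Optional[TerritoryCode]:
    """Find a territory by alpha-2, alpha-3 or numeric code."""
    key = code.strip().upper()
    if key.isdigit():
        key = key.zfill(3)  # accept "92" as well as "092"
    return _INDEX.get(key)


print(lookup("fr"), lookup("92"))
```

Storing the code and treating the name as presentation means a renamed or disputed territory only needs a label change, not a data migration.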
The list includes all overseas departments of France separately, leaving one to wonder whether you should choose "Réunion" or "France" every time you order something from abroad (I always used "France" and it works fine, it just means at least four entries in all country lists are useless). It also means France gets to have a lot more TLDs than most countries, which is nice.
The one about "buildings do not move" is hilarious. People hardly believe it when you tell them. But this happens quite often now (although not always by such a large distance). Sometimes it's cheaper to move a building and then be able to use the freed-up space for something that is more profitable (protection of historical monuments vs. money made).
But I don't know if such a moved building necessarily keeps its address.
In many languages (including German), "real estate" translates roughly to "unmovable goods". I like to think that they decided to move that building just for the language joke.
In English, the legal concept of "real estate" is generally called "real property". Which I guess makes my car and my computer and all my other property imaginary.
My uncle and aunt in New Zealand once moved their house to another street. It's quite common there, then again most houses are made of wood and easier to move than concrete.
Here's one I've run into a lot that I haven't seen mentioned yet:
- Any location within a US state will also be within a county.
Virginia treats incorporated cities as county-level entities which are not part of any county. It looks like Baltimore, MD is the only example of this outside Virginia, and so it's not encountered often.
When people ask about county, giving the independent city is generally the "correct" answer, but sometimes it leads to confusion. I remember having to explain to a loan officer, who you'd think would be up on the details of the states they operate in, that my "county" was in fact "Alexandria City" and there wasn't any other sort of answer they could get.
Just recently, I recall checking out some project linked here on HN which needed to know what county I lived in. I currently live in Fairfax County in a zip code that also includes the City of Fairfax. After giving it my zip code, it then gave me two choices for county: "Fairfax" and "Fairfax".
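The "Fairfax" vs "Fairfax" picker is avoidable if the data model admits that a "county" field really holds a county-level entity of some kind. A hedged sketch: the class names and the label format are my own invention, and the ZIP code "00000" is a deliberate placeholder, not the real one.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    COUNTY = "county"
    INDEPENDENT_CITY = "independent city"


@dataclass(frozen=True)
class CountyEquivalent:
    name: str
    kind: Kind
    state: str

    def label(self) -> str:
        # Include the kind in the label so same-named entries differ.
        if self.kind is Kind.INDEPENDENT_CITY:
            return f"City of {self.name}, {self.state}"
        return f"{self.name} County, {self.state}"


# One ZIP code can map to several county-level entities.
# "00000" is a placeholder ZIP used only for this example.
ZIP_TO_JURISDICTIONS = {
    "00000": [
        CountyEquivalent("Fairfax", Kind.COUNTY, "VA"),
        CountyEquivalent("Fairfax", Kind.INDEPENDENT_CITY, "VA"),
    ],
}


def choices_for_zip(zip_code: str) -> list[str]:
    """Labels to show the user when a ZIP code is ambiguous."""
    return [j.label() for j in ZIP_TO_JURISDICTIONS.get(zip_code, [])]


print(choices_for_zip("00000"))
```

The same structure covers the Alexandria case: the independent city is stored as the county-level answer, and no parent county is required.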
This list was clearly made by someone from Switzerland. Other countries have plenty of other crazy examples.
Someone else already pointed out that the city that houses the Dutch government has two different, equally valid names (and is also not the capital).
Not only are there addresses that lack street names, there are also addresses that lack house numbers or towns. It wouldn't surprise me if there are addresses that lack a zip/postal code in a country that normally uses them.
Then there's reassigning addresses to other houses. The original building stays, but gets a new address. I'm a victim of this: a nearby swimming pool had my address until 15 years ago, but I still get a ton of their mail. Clearly in those 15 years, a lot of their contacts never bothered to update their address data. (Or maybe nobody expects a swimming pool to move.)
Edit: Beyond regular enclaves, there's of course also the craziness of Baarle-Nassau and Baarle-Hertog. It's a scatter of Belgian enclaves in the Netherlands, some of which in turn contain Dutch enclaves. The country you're in can change every other house. It's possible all your neighbours live in a different country than you. I'm not sure if it's possible for a single house to be in two different countries at the same time. Maybe the front door in one country and the back door in another? Exactly who gets to maintain the streets, I have no idea.
There are buildings that are in both Canada and the USA. I'm not sure if there are any houses like that but there might be. The two places that this happens are near the Pacific coast, and the border between the province of Quebec and several US states.
A couple other oddities of geography I have noticed as a geospatial system developer.
- Outside the US and Europe things tend to get really wacky with geography and geospatial data. Many GIS products are licensed per country for this very reason.
- In the US, zip codes are the most terrible type of geography to use for any meaningful analysis, as they are constantly changing and do not actually have defined borders. Zip codes are basically just labels that get appended to street segments.
- If you look in the back of the ~200 page pdf that comes with most digital geographic data, you will find an appendix. 99% of the appendix is a section typically called "disputed territories", and 99% of the disputed territories are China claiming to own another nation. I never realized quite the extent of the megalomania until I read a few of those.
A better example could perhaps be The Hague (Den Haag or 's-Gravenhage in Dutch). It's also an example for the first and third ones (Dutch doesn't use 's (or des) nor den anymore).
edit: oh and it's also the seat of the government but not the capital
I grew up in a tiny village in rural Virginia (maybe 50 houses in total) and we had no address. Mail got delivered to a post office box that was conveniently in the post office next door. For UPS deliveries, we would literally give them the address, "Next to the post office, on the east side."
Every so often we'd get a delivery driver with a poor sense of direction, and he'd give the package to the people on the other side of the post office instead. Aside from that, it worked great.
My brother once sent a letter by post to a friend, from Yokohama, Japan to suburban Sydney, Australia. He couldn't remember his friend's address but he remembered the directions to walk from his friend's local post office, so that's what he wrote as the address: the instructions to walk from the post office to his friend's house. The letter was delivered just fine. In fact the story of its delivery made the front page of a national newspaper (Column 8, Sydney Morning Herald, national edition).
I grew up in a house whose street address (before 911) was "House 313 behind the School". UPS was happy with that address, but other delivery carriers had some issues.
USPS didn't do home mail delivery, so the PO Box we kept was fine for them, but a lot of folks insist on a "physical address" and didn't like "House 313 behind the School". 911 has changed this in rural communities by giving a street address that means something to the police & fire, but means little to the people living there.
Rather than documenting all of the incorrect assumptions programmers make by topic, it might be easier to list the ones that they make that are correct.
"In order to handle all of these clever edge cases, your [Time|Date|Name|Address|Gender] handling code should be 2,000 lines long, not 2 lines long."
Which is not very helpful if you're just trying to get your product out. Sure, you could take another week to make sure you handle that hamlet in Macedonia whose name isn't a text string, but then again, why not just slap in "address" "city" "state" and "zip" fields and get on with the rest of your actual product? The naive solution covers pretty much all of the USA market. Hell, call those last two "state/province" and "zip/postcode" and you've got most of the world handled. Fix the edge cases in V2, if you even have to.