Rensselaer Polytechnic Institute
Civil and Environmental Engineering Department
Jose Holguin-Veras, Ph.D., P.E.
Office room: JEC 4030, Telephone: 276-6221
Email: jhv@rpi.edu
Office hours: W
Lecture hours: Mondays & Thursdays
Prerequisites: CIVL 2030 + DSES 4140 or their equivalents
Objective:
1. To introduce the students to advanced econometric techniques involving the estimation of discrete choice models.
2. To introduce the students to estimation of discrete choice models using sampling of alternatives techniques.
The case study
The main objective of the modeling process is to gain insights into the determinants of business location. Among other things, this knowledge provides insights into the most effective mechanisms to attract business and therefore foster economic development.
The analyses are based upon employment data obtained from the New Jersey Department of Commerce and Economic Growth and the New Jersey Office of Business Research. This database contained the information that companies moving into
The 1,017 firms employed an estimated total of 108,000 employees and represented 74 industries, each defined by a different NAICS (North American Industry Classification System) code. The data set included the name of the firm, the year of registration, the destination county and address in New Jersey, the state or country of origin, the number of employees, the two-digit SIC code, and a description of the “line of business.” A substantial number of firms were missing addresses, number of employees, and/or SIC codes. Missing addresses were obtained from the Internet or telephone books. The project team converted the two-digit SIC codes to three-digit NAICS codes and determined the missing codes from the “line of business” description. For firms missing the number of employees, the average number of employees for firms with the same NAICS code was used.
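The imputation rule for missing employee counts can be sketched as follows. This is a minimal illustration only: the record layout, field names, and sample values are hypothetical and are not taken from the actual database.

```python
from collections import defaultdict


def impute_employees(firms):
    """Fill missing employee counts with the mean of firms sharing the same NAICS code.

    `firms` is a list of dicts with hypothetical keys "naics" and "employees";
    a missing count is represented by None. The input list is not modified.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for firm in firms:
        if firm["employees"] is not None:
            totals[firm["naics"]] += firm["employees"]
            counts[firm["naics"]] += 1

    imputed = []
    for firm in firms:
        filled = dict(firm)  # copy so the original record stays untouched
        code = filled["naics"]
        if filled["employees"] is None and counts[code] > 0:
            filled["employees"] = totals[code] / counts[code]
        imputed.append(filled)
    return imputed


# Hypothetical records for illustration only.
firms = [
    {"name": "A", "naics": 311, "employees": 40},
    {"name": "B", "naics": 311, "employees": 60},
    {"name": "C", "naics": 311, "employees": None},
    {"name": "D", "naics": 484, "employees": None},  # no peer with a known count
]
result = impute_employees(firms)
```

A firm with no peer of known size under the same code (firm "D" here) is left as missing in this sketch; how such cases were handled in the actual data set is not stated.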
The firms in the database were geolocated using Geographic Information Systems (GIS) for spatial analyses. The resulting GIS was used to generate a set of zoning systems to facilitate the modeling process. A key objective of the zoning systems was to discretize the choices so that, instead of facing a continuum of choices in space, each firm chooses from a discrete set of zones. The first trials used zoning systems based on ZIP codes and counties. Since no acceptable models resulted from these attempts, the project team created zoning systems based on internally “homogeneous” zones, in terms of key socio-economic characteristics such as income. The zoning system used in the lab is one of these zoning systems.
The original database was complemented with estimates of transportation accessibility and socio-economic attributes of the different zones (i.e., population, population density, area, median income) extracted from the population census. Two different sets of accessibility measures were considered. The first one captures the travel impedances from each zone to the major population centers of
Figure 1: Zoning system used in the analyses

The data set used for estimation included variables that characterize the firm, as well as variables that measure transportation accessibility. The original set of variables is shown in Table 1.
Table 1: Variables used in discrete choice modeling

The type of economic activity was represented using the NAICS (North American Industry Classification System) codes (for more information see http://www.bls.gov/bls/naics.htm). The original NAICS values were further aggregated into six supergroups, as outlined in Table 2. An even coarser grouping was also created, which classifies each economic activity as either goods producing (NAICS 11 to 33) or service providing (NAICS 42 and above).
Table 2: Original NAICS and supergroups (number of observations in parentheses)
Supergroup #1:
21 Mining (0)
22 Utilities (3)
23 Construction (9)
Supergroup #2:
31 Food manufacturing (70)
32 Wood product manufacturing (68)
33 Primary metal manufacturing (123)
Supergroup #3:
42 Wholesale Trade (64)
44 Retail trade (55)
45 Sporting goods, hobby, book, and music stores (47)
Supergroup #4:
48 Transportation (30)
49 Postal service and warehousing (109)
Supergroup #5:
51 Information (47)
52 Finance and insurance (100)
53 Real estate and rental and leasing (5)
54 Professional, scientific, and technical services (53)
Supergroup #6:
55 Management of companies and enterprises (3)
56 Administrative and support and Waste management / remediation services (23)
62 Health care and Social assistance (12)
71 Arts, entertainment, and recreation (5)
72 Accommodation and Food services (13)
81 Other services (except Public administration) (9)
92 Government (5)
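The aggregation in Table 2 can be expressed as a simple lookup. The sketch below encodes the six supergroups listed above, together with the goods producing / service providing split; the function names are hypothetical, and codes that do not appear in Table 2 are returned as None rather than assigned to a group.

```python
# Two-digit NAICS codes per supergroup, as listed in Table 2.
SUPERGROUPS = {
    1: (21, 22, 23),
    2: (31, 32, 33),
    3: (42, 44, 45),
    4: (48, 49),
    5: (51, 52, 53, 54),
    6: (55, 56, 62, 71, 72, 81, 92),
}

# Invert the table: two-digit code -> supergroup number.
NAICS_TO_SUPERGROUP = {
    code: group for group, codes in SUPERGROUPS.items() for code in codes
}


def two_digit(naics_code):
    """Return the leading two digits of a NAICS code given with two or more digits."""
    return int(str(naics_code)[:2])


def supergroup(naics_code):
    """Return the supergroup number, or None if the code is not in Table 2."""
    return NAICS_TO_SUPERGROUP.get(two_digit(naics_code))


def sector(naics_code):
    """Classify as goods producing (NAICS 11 to 33) or service providing (42 and above)."""
    code = two_digit(naics_code)
    if 11 <= code <= 33:
        return "goods producing"
    if code >= 42:
        return "service providing"
    return None
```

For example, `supergroup(311)` maps a three-digit food manufacturing code to supergroup 2, and `sector(484)` classifies a transportation code as service providing.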
Objectives of the analyses:
a) To determine the key variables explaining the business location process
b) To define the set of transportation policies that would translate into an increase in business relocations to the most impoverished areas of
c) To quantify the impacts of these policies to determine their effectiveness
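One way to think about objective c) is to compare the predicted choice probabilities before and after a policy changes a zone's attributes. The sketch below assumes a multinomial logit model with a single travel impedance variable; the coefficient, the three zones, and the impedance values are all hypothetical and are not estimates from the lab data.

```python
import math


def logit_probabilities(utilities):
    """Multinomial logit probabilities from a list of systematic utilities."""
    max_u = max(utilities)  # subtract the maximum for numerical stability
    exps = [math.exp(u - max_u) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def zone_utilities(travel_times, beta_time):
    """Systematic utility of each zone with travel impedance as the only attribute."""
    return [beta_time * t for t in travel_times]


BETA_TIME = -0.05                   # hypothetical coefficient (utility per minute)
base_times = [30.0, 45.0, 60.0]     # hypothetical impedances for three zones
policy_times = [30.0, 45.0, 40.0]   # hypothetical policy: zone 3 impedance reduced

base = logit_probabilities(zone_utilities(base_times, BETA_TIME))
policy = logit_probabilities(zone_utilities(policy_times, BETA_TIME))

# Change in the probability that a firm locates in each zone.
change = [p - b for p, b in zip(policy, base)]
```

With a negative impedance coefficient, the zone whose impedance falls gains probability and the other zones lose it, and the changes sum to zero. In the lab, the utilities would come from the models estimated in LIMDEP rather than from assumed values.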
1st part of the lab: Estimation of discrete choice models using the complete choice set
Data
The data can be downloaded from the course web site. The zip file contains:
a) the input data in TXT format (zs2LIMDEP.TXT)
b) a LIMDEP command file to read the data (zs2Input.lim)
c) a LIMDEP command file to do a basic set of transformations (Transformations.lim)
d) a file with the zonal characteristics (ZS2zonaldata.xls)
To read the data in LIMDEP:
a) unzip all the files and save them in folder XXXX
b) change Project/Settings/Number of Cells to 19000000
c) open the file zs2Input.lim in LIMDEP, change the folder address to XXXX, and save the file when done
d) Run zs2Input.lim from LIMDEP using Run/Run File
e) After it finishes reading the data, use Run/Run File to run Transformations.lim
f) Save the project so that all transformations and data are saved
To run the example models:
a) run Example.lim from LIMDEP using Run/Run File
b) once it has run, you can try different models
2nd part of the lab
a) use sampling of alternatives techniques to estimate the best models obtained in the first part of the lab
b) compare results
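The idea behind sampling of alternatives can be sketched as follows: for each firm, keep the chosen zone and draw a random subset of the non-chosen zones, then evaluate the logit probability over that reduced set. When the non-chosen alternatives are drawn uniformly at random, the sampling correction term is the same for every alternative in the subset and cancels, so the standard logit formula applies to the sampled set. The zones, utilities, and subset size below are hypothetical, and the lab itself carries out the estimation in LIMDEP.

```python
import math
import random


def sample_choice_set(chosen, alternatives, size, rng):
    """Return the chosen alternative plus (size - 1) others drawn uniformly without replacement."""
    others = [a for a in alternatives if a != chosen]
    return [chosen] + rng.sample(others, size - 1)


def logit_probability(chosen, choice_set, utility):
    """Logit probability of `chosen` among the alternatives in `choice_set`."""
    max_u = max(utility[a] for a in choice_set)  # for numerical stability
    exps = {a: math.exp(utility[a] - max_u) for a in choice_set}
    return exps[chosen] / sum(exps.values())


rng = random.Random(0)                   # fixed seed so the draw is repeatable
zones = list(range(1, 21))               # 20 hypothetical zones
utility = {z: -0.1 * z for z in zones}   # hypothetical systematic utilities

subset = sample_choice_set(chosen=5, alternatives=zones, size=6, rng=rng)
p_sampled = logit_probability(5, subset, utility)  # over the sampled set
p_full = logit_probability(5, zones, utility)      # over the complete choice set
```

The probability over the sampled set is larger than over the complete set because fewer alternatives compete in the denominator. What should be compared between the two parts of the lab are the estimated coefficients, not these probabilities.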